207 Human Secreted Proteins

Field of the Invention

This invention relates to newly identified polynucleotides and the polypeptides
encoded by these polynucleotides, uses of such polynucleotides and polypeptides, and
their production.

Background of the Invention

Unlike bacteria, which exist as a single compartment surrounded by a
membrane, human cells and other eukaryotes are subdivided by membranes into many
functionally distinct compartments. Each membrane-bounded compartment, or
organelle, contains different proteins essential for the function of the organelle. The cell
uses "sorting signals," which are amino acid motifs located within the protein, to target
proteins to particular cellular organelles.

One type of sorting signal, called a signal sequence, a signal peptide, or a leader 
sequence, directs a class of proteins to an organelle called the endoplasmic reticulum 
(ER). The ER separates the membrane-bounded proteins from all other types of
proteins. Once localized to the ER, both groups of proteins can be further directed to
another organelle called the Golgi apparatus. Here, the Golgi distributes the proteins to
vesicles, including secretory vesicles, the cell membrane, lysosomes, and the other 
organelles. 

Proteins targeted to the ER by a signal sequence can be released into the 
extracellular space as a secreted protein. For example, vesicles containing secreted 
proteins can fuse with the cell membrane and release their contents into the extracellular 
space - a process called exocytosis. Exocytosis can occur constitutively or after receipt 
of a triggering signal. In the latter case, the proteins are stored in secretory vesicles (or 
secretory granules) until exocytosis is triggered. Similarly, proteins residing on the cell 
membrane can also be secreted into the extracellular space by proteolytic cleavage of a 
"linker" holding the protein to the membrane. 

Despite the great progress made in recent years, only a small number of genes 
encoding human secreted proteins have been identified. These secreted proteins include 
the commercially valuable human insulin, interferon, Factor VIII, human growth
hormone, tissue plasminogen activator, and erythropoietin. Thus, in light of the
pervasive role of secreted proteins in human physiology, a need exists for identifying
and characterizing novel human secreted proteins and the genes that encode them. This
knowledge will allow one to detect, to treat, and to prevent medical disorders by using 
secreted proteins or the genes that encode them. 




Summary of the Invention 



The present invention relates to novel polynucleotides and the encoded 
polypeptides. Moreover, the present invention relates to vectors, host cells, antibodies, 
and recombinant methods for producing the polypeptides and polynucleotides. Also
provided are diagnostic methods for detecting disorders related to the polypeptides, and
therapeutic methods for treating such disorders. The invention further relates to
screening methods for identifying binding partners of the polypeptides.

Detailed Description

Definitions 

The following definitions are provided to facilitate understanding of certain 
terms used throughout this specification. 

In the present invention, "isolated" refers to material removed from its original
environment (e.g., the natural environment if it is naturally occurring), and thus is
altered "by the hand of man" from its natural state. For example, an isolated
polynucleotide could be part of a vector or a composition of matter, or could be
contained within a cell, and still be "isolated" because that vector, composition of
matter, or particular cell is not the original environment of the polynucleotide.

In the present invention, a "secreted" protein refers to those proteins capable of
being directed to the ER, secretory vesicles, or the extracellular space as a result of a
signal sequence, as well as those proteins released into the extracellular space without
necessarily containing a signal sequence. If the secreted protein is released into the
extracellular space, the secreted protein can undergo extracellular processing to produce
a "mature" protein. Release into the extracellular space can occur by many
mechanisms, including exocytosis and proteolytic cleavage.

As used herein, a "polynucleotide" refers to a molecule having a nucleic acid
sequence contained in SEQ ID NO:X or the cDNA contained within the clone deposited
with the ATCC. For example, the polynucleotide can contain the nucleotide sequence
of the full length cDNA sequence, including the 5' and 3' untranslated sequences, the
coding region, with or without the signal sequence, the secreted protein coding region,
as well as fragments, epitopes, domains, and variants of the nucleic acid sequence.
Moreover, as used herein, a "polypeptide" refers to a molecule having the translated
amino acid sequence generated from the polynucleotide as broadly defined.

In the present invention, the full length sequence identified as SEQ ID NO:X
was often generated by overlapping sequences contained in multiple clones (contig
analysis). A representative clone containing all or most of the sequence for SEQ ID
NO:X was deposited with the American Type Culture Collection ("ATCC"). As
shown in Table 1, each clone is identified by a cDNA Clone ID (Identifier) and the
ATCC Deposit Number. The ATCC is located at 10801 University Boulevard,
Manassas, Virginia 20110-2209, USA. The ATCC deposit was made pursuant to the
terms of the Budapest Treaty on the international recognition of the deposit of
microorganisms for purposes of patent procedure.
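
The contig analysis referred to above, in which a full length sequence is built from overlapping clone sequences, can be illustrated with a minimal sketch. The function name `merge_overlapping`, the `min_overlap` parameter, and the two toy reads are hypothetical and are not taken from this specification; a real assembly would also have to tolerate sequencing errors and reads in reverse-complement orientation.

```python
from typing import Optional


def merge_overlapping(left: str, right: str, min_overlap: int = 4) -> Optional[str]:
    """Join two reads when a suffix of `left` exactly matches a prefix of `right`.

    Returns the merged sequence using the longest exact overlap found,
    or None if no overlap of at least `min_overlap` bases exists.
    """
    longest_possible = min(len(left), len(right))
    # Try the longest candidate overlap first so the merge is as compact as possible.
    for size in range(longest_possible, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None


# Hypothetical reads sharing the six-base overlap "ACCGGA".
clone_a = "ATGGCGTTACCGGA"
clone_b = "ACCGGATTTGCACT"
contig = merge_overlapping(clone_a, clone_b)
print(contig)
```

Trying the longest overlap first is a deliberate simplification; it keeps the sketch short but would mis-join repetitive sequences.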

A "polynucleotide" of the present invention also includes those polynucleotides
capable of hybridizing, under stringent hybridization conditions, to sequences contained
in SEQ ID NO:X, the complement thereof, or the cDNA within the clone deposited with
the ATCC. "Stringent hybridization conditions" refers to an overnight incubation at
42°C in a solution comprising 50% formamide, 5x SSC (750 mM NaCl, 75 mM sodium
citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran
sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed by washing the
filters in 0.1x SSC at about 65°C.
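
The salt figures in the stringent conditions follow from simple scaling: if 5x SSC is 750 mM NaCl and 75 mM sodium citrate, then 1x SSC is 150 mM NaCl and 15 mM sodium citrate. The sketch below, with the hypothetical helper `ssc_concentrations`, shows that arithmetic for the hybridization solution and for the 0.1x wash; it is an illustration of the numbers stated above, not a laboratory protocol.

```python
from typing import Dict

# 1x SSC, derived from the 5x figures stated in the text (750 mM / 75 mM divided by 5).
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_concentrations(fold: float) -> Dict[str, float]:
    """Return millimolar concentrations of each SSC solute at the given fold strength."""
    if fold <= 0:
        raise ValueError("fold strength must be positive")
    # Round to suppress floating-point noise such as 15.000000000000002.
    return {solute: round(mm * fold, 3) for solute, mm in SSC_1X_MM.items()}


hybridization = ssc_concentrations(5)    # the overnight incubation
wash = ssc_concentrations(0.1)           # the wash at about 65°C
print(hybridization, wash)
```

The fifty-fold drop in salt between hybridization and wash is what makes the wash step the more stringent of the two.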

Also contemplated are nucleic acid molecules that hybridize to the 
polynucleotides of the present invention at lower stringency hybridization conditions. 
Changes in the stringency of hybridization and signal detection are primarily
accomplished through the manipulation of formamide concentration (lower percentages
of formamide result in lowered stringency), salt conditions, or temperature. For
example, lower stringency conditions include an overnight incubation at 37°C in a
solution comprising 6X SSPE (20X SSPE = 3M NaCl; 0.2M NaH2PO4; 0.02M EDTA,
pH 7.4), 0.5% SDS, 30% formamide, 100 µg/ml salmon sperm blocking DNA,
followed by washes at 50°C with 1X SSPE, 0.1% SDS. In addition, to achieve even
lower stringency, washes performed following stringent hybridization can be done at
higher salt concentrations (e.g., 5X SSC).
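
As a worked example of the lower stringency recipe, the following sketch applies the usual dilution relation (stock concentration times stock volume equals final concentration times final volume) to the 20X SSPE composition given above. The helper name `dilute_sspe` and the 100 ml batch size are assumptions chosen only for illustration.

```python
from typing import Dict, Tuple

# Molar composition of 20X SSPE as stated in the text.
SSPE_20X_M = {"NaCl": 3.0, "NaH2PO4": 0.2, "EDTA": 0.02}


def dilute_sspe(target_fold: float, final_volume_ml: float) -> Tuple[float, Dict[str, float]]:
    """Return (ml of 20X stock required, molar concentration of each solute)."""
    if not 0 < target_fold <= 20:
        raise ValueError("target strength must be between 0 and 20X")
    fraction = target_fold / 20.0
    stock_ml = final_volume_ml * fraction
    molar = {solute: round(m * fraction, 4) for solute, m in SSPE_20X_M.items()}
    return stock_ml, molar


# 6X SSPE for hybridization and 1X SSPE for the washes, 100 ml each (assumed volume).
stock_6x_ml, molar_6x = dilute_sspe(6, 100.0)
stock_1x_ml, molar_1x = dilute_sspe(1, 100.0)
print(stock_6x_ml, molar_6x)
print(stock_1x_ml, molar_1x)
```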

Note that variations in the above conditions may be accomplished through the 
inclusion and/or substitution of alternate blocking reagents used to suppress 
background in hybridization experiments. Typical blocking reagents include
Denhardt's reagent, BLOTTO, heparin, denatured salmon sperm DNA, and
commercially available proprietary formulations. The inclusion of specific blocking
reagents may require modification of the hybridization conditions described above, due
to problems with compatibility.

Of course, a polynucleotide which hybridizes only to poly A+ sequences (such
as any 3' terminal poly A+ tract of a cDNA), or to a
complementary stretch of T (or U) residues, would not be included in the definition of
"polynucleotide," since such a polynucleotide would hybridize to any nucleic acid
molecule containing a poly (A) stretch or the complement thereof (e.g., practically any
double-stranded cDNA clone).
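
The exclusion just described can be pictured as a filter applied to candidate probe sequences: a probe made up solely of A residues, or solely of the complementary T (or U) residues, carries no sequence-specific information. The function below is a minimal sketch under that reading; the name `is_homopolymer_tract` is hypothetical, and the check works on sequence text only rather than modelling hybridization itself.

```python
def is_homopolymer_tract(sequence: str) -> bool:
    """True if the sequence consists only of A, only of T, or only of U residues."""
    bases = set(sequence.upper())
    return bases in ({"A"}, {"T"}, {"U"})


candidates = ["AAAAAAAAAAAA", "TTTTTTTT", "uuuuuu", "ATGGCGTTAAAA", ""]
# Keep only probes that are not a bare poly(A) tract or its complement.
informative = [seq for seq in candidates if seq and not is_homopolymer_tract(seq)]
print(informative)
```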

The polynucleotide of the present invention can be composed of any
polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA
or modified RNA or DNA. For example, polynucleotides can be composed of single-
and double-stranded DNA, DNA that is a mixture of single- and double-stranded
regions, single- and double-stranded RNA, and RNA that is a mixture of single- and
double-stranded regions, hybrid molecules comprising DNA and RNA that may be
single-stranded or, more typically, double-stranded or a mixture of single- and
double-stranded regions. In addition, the polynucleotide can be composed of triple-stranded
regions comprising RNA or DNA or both RNA and DNA. A polynucleotide may also
contain one or more modified bases or DNA or RNA backbones modified for stability
or for other reasons. "Modified" bases include, for example, tritylated bases and
unusual bases such as inosine. A variety of modifications can be made to DNA and
RNA; thus, "polynucleotide" embraces chemically, enzymatically, or metabolically
modified forms.

The polypeptide of the present invention can be composed of amino acids joined
to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and
may contain amino acids other than the 20 gene-encoded amino acids. The
polypeptides may be modified by either natural processes, such as posttranslational
processing, or by chemical modification techniques which are well known in the art.
Such modifications are well described in basic texts and in more detailed monographs,
as well as in a voluminous research literature. Modifications can occur anywhere in a
polypeptide, including the peptide backbone, the amino acid side-chains and the amino
or carboxyl termini. It will be appreciated that the same type of modification may be
present in the same or varying degrees at several sites in a given polypeptide. Also, a
given polypeptide may contain many types of modifications. Polypeptides may be
branched, for example, as a result of ubiquitination, and they may be cyclic, with or
without branching. Cyclic, branched, and branched cyclic polypeptides may result
from posttranslational natural processes or may be made by synthetic methods.
Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent
attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a
nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative,
cross-linking, cyclization, disulfide bond formation,
formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI
anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation,
pegylation, proteolytic processing, phosphorylation, prenylation, racemization,
selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins
such as arginylation, and ubiquitination. (See, for instance, PROTEINS -
STRUCTURE AND MOLECULAR PROPERTIES, 2nd Ed., T. E. Creighton, W.
H. Freeman and Company, New York (1993); POSTTRANSLATIONAL
COVALENT MODIFICATION OF PROTEINS, B. C. Johnson, Ed., Academic
Press, New York, pgs. 1-12 (1983); Seifter et al., Meth Enzymol 182:626-646 (1990);
Rattan et al., Ann NY Acad Sci 663:43-62 (1992).)

"SEQ ID NO:X" refers to a polynucleotide sequence while "SEQ ID NO: Y" 
refers to a polypeptide sequence, both sequences being identified by an integer specified in
Table 1. 

"A polypeptide having biological activity" refers to polypeptides exhibiting 
activity similar, but not necessarily identical to, an activity of a polypeptide of the
present invention, including mature forms, as measured in a particular biological assay,
with or without dose dependency. In the case where dose dependency does exist, it
need not be identical to that of the polypeptide, but rather substantially similar to the
dose-dependence in a given activity as compared to the polypeptide of the present
invention (i.e., the candidate polypeptide will exhibit greater activity or not more than
about 25-fold less and, preferably, not more than about tenfold less activity, and most
preferably, not more than about three-fold less activity relative to the polypeptide of the
present invention).
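
The fold-activity limits in this definition can be expressed as a small classifier. This is only a sketch: the function name `activity_tier` and the tier labels are hypothetical, and because the text says "about" 25-fold, tenfold, and three-fold, the hard cut-offs used here are a simplification.

```python
def activity_tier(candidate_activity: float, reference_activity: float) -> str:
    """Classify a candidate's assay activity relative to the reference polypeptide."""
    if reference_activity <= 0 or candidate_activity < 0:
        raise ValueError("activities must be non-negative and the reference positive")
    if candidate_activity >= reference_activity:
        return "equal or greater activity"
    if candidate_activity == 0:
        return "outside definition"
    fold_less = reference_activity / candidate_activity
    if fold_less <= 3:
        return "most preferred (within about three-fold)"
    if fold_less <= 10:
        return "preferred (within about tenfold)"
    if fold_less <= 25:
        return "within definition (within about 25-fold)"
    return "outside definition"


# Assay readings in arbitrary units; the values are illustrative only.
for reading in (150.0, 50.0, 20.0, 5.0, 1.0):
    print(reading, activity_tier(reading, reference_activity=100.0))
```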

Polynucleotides and Polypeptides of the Invention

FEATURES OF PROTEIN ENCODED BY GENE NO: 1 

This gene is expressed primarily in melanocytes and, to a lesser extent, in 
testes, ovary, kidney and other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer, disorders of neural crest derived cells including pigmentation
defects, melanoma, reproductive organ defects, and defects of the kidney. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the skin,
reproductive, and renal systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.
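
The comparison against a "standard gene expression level" that recurs throughout the gene descriptions can be sketched as a fold-change calculation. The specification does not say what counts as "significantly" higher or lower, so the two-fold default below (`min_abs_log2_fold=1.0`) is purely an assumption, as are the function name `compare_to_standard` and the example values.

```python
import math


def compare_to_standard(sample_level: float, standard_level: float,
                        min_abs_log2_fold: float = 1.0) -> str:
    """Label a sample as higher, lower, or comparable relative to the healthy standard.

    `min_abs_log2_fold` is a hypothetical cut-off (1.0 corresponds to two-fold).
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    log2_fold = math.log2(sample_level / standard_level)
    if log2_fold >= min_abs_log2_fold:
        return "higher"
    if log2_fold <= -min_abs_log2_fold:
        return "lower"
    return "comparable"


# Illustrative measurements in arbitrary units (not data from this specification).
print(compare_to_standard(8.0, 2.0))
print(compare_to_standard(1.0, 4.0))
print(compare_to_standard(2.2, 2.0))
```

In practice a statistical test across replicate samples would replace the fixed cut-off.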

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treating disorders that arise from alterations in
the number or fate of neural crest derived cells including cancers such as melanoma and
defects of the developing reproductive system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 2 

This gene is expressed primarily in infant brain and fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental disorders of the brain or lung. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the central nervous and pulmonary systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treating or diagnosing disorders associated 
with abnormal proliferation of cells in the central nervous system and developing lung.

FEATURES OF PROTEIN ENCODED BY GENE NO: 3 

This gene is expressed primarily in breast lymph node and to a lesser extent in 
ovarian cancer and chondrosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune responses such as inflammation or immune surveillance for
tumors. This gene may be important for inflammatory responses associated with
tumors. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 236 as residues: Lys-45 to Val-50, Lys-69 to Arg-76.
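
The residue notation used for preferred epitopes (for example "Lys-45 to Val-50") gives a three-letter amino acid code and a position for each end of the range. A minimal sketch of reading that notation into structured form follows; the names `parse_epitope_ranges` and `epitope_length` are hypothetical, and positions are assumed to be 1-based and inclusive.

```python
import re
from typing import List, Tuple

# Matches ranges such as "Lys-45 to Val-50": three-letter code, hyphen, position.
EPITOPE_RANGE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text: str) -> List[Tuple[str, int, str, int]]:
    """Return (start residue, start position, end residue, end position) tuples."""
    return [(first, int(start), last, int(end))
            for first, start, last, end in EPITOPE_RANGE.findall(text)]


def epitope_length(start: int, end: int) -> int:
    """Number of residues in an inclusive, 1-based range."""
    return end - start + 1


ranges = parse_epitope_ranges("Lys-45 to Val-50, Lys-69 to Arg-76.")
lengths = [epitope_length(start, end) for _, start, _, end in ranges]
print(ranges, lengths)
```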

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment or diagnosis of immune responses
including those associated with tumor-induced inflammation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 4 

This gene is expressed primarily in T-cells and T-cell lymphomas.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological diseases involving T-cells such as inflammation,
autoimmunity, and cancers including T-cell lymphomas. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of T-cells and other cells of the immune
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosing and treating T-cell based disorders
such as inflammatory diseases, autoimmune disease and tumors including T-cell
lymphomas.

FEATURES OF PROTEIN ENCODED BY GENE NO: 5

This gene is expressed primarily in activated monocytes.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation, autoimmunity, infection, or disorders involving activation
of monocytes. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 238 as residues: Asp-19 to Arg-31.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosing or treating diseases that result in 
activation of monocytes including infections, inflammatory responses or autoimmune
diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 6 

The translation product of this gene shares sequence homology with terminal
deoxynucleotidyltransferase which is thought to be important in catalyzing the
elongation of oligo- or polydeoxynucleotide chains.

This gene is expressed primarily in activated human neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer, particularly those of the blood such as leukemia and deficiencies
in neutrophils such as neutropenia. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the cardiovascular system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to terminal deoxynucleotidyltransferase
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
treatment and differential diagnosis of acute leukemias. Alternatively, this gene may
function in the proliferation of neutrophils and be useful as a treatment for neutropenia,
for example, following neutropenia as a result of chemotherapy.

FEATURES OF PROTEIN ENCODED BY GENE NO: 7 

The contig exhibits a reasonable homology to the human chorionic gonadotropin
(HCG) analogue-GT beta-subunit as disclosed in U.S. Patent No. 5,508,261 and PCT
Publication No. WO 92/22568. There is a high degree of conservation of the
structurally important cysteine residues in these identities.

This gene is expressed primarily in IL-1- and LPS-induced neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the immune system, including inflammatory diseases and
allergies. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment/diagnosis of diseases of the immune
system since expression is primarily in neutrophils, and may be useful as a growth
factor for the differentiation or proliferation of neutrophils for the treatment of
neutropenia following chemotherapy.

FEATURES OF PROTEIN ENCODED BY GENE NO: 8 

This gene is expressed primarily in IL-1- and LPS-induced neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the immune system, including inflammatory diseases and
allergies. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 241 as residues: Ser-14 to Pro-22, Leu-43 to Val-53.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of diseases of the 
immune system since expression is primarily in neutrophils, and may be useful as a 
growth factor for the differentiation or proliferation of neutrophils for the treatment of 
neutropenia following chemotherapy. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 9 

This gene is expressed primarily in IL-1- and LPS-induced neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the immune system, including inflammatory diseases and 
allergies. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a 
sequence shown in SEQ ID NO: 242 as residues: Tyr-22 to His-35. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment/diagnosis of diseases of the immune
system since expression is primarily in neutrophils, and may be useful as a growth
factor for the differentiation or proliferation of neutrophils for the treatment of
neutropenia following chemotherapy.

FEATURES OF PROTEIN ENCODED BY GENE NO: 10

This gene is expressed primarily in activated T-cells and to a lesser extent in
endothelial cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune dysfunctions including cancer of the T lymphocytes and
autoimmune disorders and inflammation. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of immune disorders,
particularly of T-cell origin, and may act as a growth factor for particular subsets of
T-cells such as CD4 positive cells, which would make this a useful therapeutic for the
treatment of HIV and other immune compromising illnesses.

FEATURES OF PROTEIN ENCODED BY GENE NO: 11

This gene is expressed primarily in fetal tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of many developmental abnormalities. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the developing fetus,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a growth factor or differentiation factor for 
particular cell types in the developing fetus and may be useful in replacement or other 
types of therapy in cases where the gene is expressed aberrantly.

FEATURES OF PROTEIN ENCODED BY GENE NO: 12 

This gene is expressed primarily in T-cells and to a lesser extent in tumor tissue 
including glioblastoma, meningioma, and Wilms tumor. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell- type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the immune system including autoimmune conditions such as 
rheumatoid arthritis, inflammatory- disorders and cancer. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identificauon of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the immune, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 245 as residues: 
Thr-9 to Ser-14.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis/modulation of immune function
disorders, including rheumatoid arthritis and inflammatory responses. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 13 

This gene is expressed primarily in placenta and to a lesser extent in fetal liver 
and bone marrow. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of hematological disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the hematological and immune
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful as a growth factor for hemopoietic stem cells or 
progenitor cells in the treatment of chemotherapy patients or kidney disease. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 14

This gene is expressed primarily in stromal cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of hematopoietic disorders including cancer,
neutropenia, anemia, and thrombocytopenia. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the hematopoietic and immune systems, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a growth factor for hematopoietic stem cells or
progenitor cells, in particular following chemotherapy treatment. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 15 

The translation product of this gene shares sequence homology with epsilon-COP
from Bos taurus, which is thought to be important as a component of coatomer, a
complex of seven proteins, that is the major component of the non-clathrin membrane 
coat. Preferred polypeptides encoded by this gene comprise the following amino acid 
sequences: 

MAPP^GPASGGSGEVDELFDVKNAFY1GSYQQCINE.AXXVKLS.SPERD.V-ERD
VTT-YT^^\XAQRKFGVVLDEIKPSSAPELQAVT^
MSRSXDVmiTFU-^^
RLDLARKELKRMQD
PTLLLUNGQAACHMAQGRWEAAEGL^
PEVTNR^XSQLKDAHRSHPnKEYQAKENDFDRLVLQYAPSA£AGPELSGP
(SEQ ID NO:458); or RDVERDVFLYRAYLAQRKFGVVLDEIKPSSAPELQAVRMF
,ADYL.AHESRRDSIVAELDREMSRSXDVT^ri^
QGDSLECTAMTVQILLKLDRL^
GEKXQDAYYIFQEMADKCSPTLLI^
SGYPETLVNLIVLSQHLGKPPEVTNR^XSQLKDAHRSHPFIKEYQAKENDFDRL
VLQYAPSA (SEQ ID NO:459).

This gene is expressed primarily in activated monocytes and T-cells, and to a 
lesser extent in multiple other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunomodulation, specifically relating to transport problems in these
cells. Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution and homology to epsilon-COP indicate that
polynucleotides and polypeptides corresponding to this gene are useful for
treating/diagnosing problems with the cellular transport of proteins that may result in
immunologic dysfunction.

FEATURES OF PROTEIN ENCODED BY GENE NO: 16 

The translation product of this gene shares sequence homology with an RNA
helicase which is thought to be important in polynucleotide metabolism. The translation
product of this contig exhibits good homology to the LbeIF4A antigen of Leishmania
braziliensis. The LbeIF4A antigen, or immunogenic portions of it, can be used to
induce protective immunity against leishmaniasis, specifically L. donovani, L. chagasi,
L. infantum, L. major, L. braziliensis, L. panamensis, L. tropica and L. guyanensis. It
can also be used diagnostically to detect Leishmania infection or to stimulate a cellular
and/or humoral immune response or to stimulate the production of interleukin-12.

This gene is expressed primarily in colon cancer and to a lesser extent in
pituitary.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of cancers, particularly of the colon. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the gastrointestinal
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 249 as residues: Glu-93 to Ala-98, Gln-150 to
Leu-156, Leu-220 to Leu-231, Leu-268 to Arg-273, Val-324 to Pro-341, Arg-372 to
Asn-380, Ser-405 to Gly-410, Phe-426 to Ala-433, Glu-453 to Asp-470, Arg-506 to
Ser-547.
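
Epitope lists of this kind give each epitope as a pair of residue positions in the "Xaa-N to Yaa-M" notation. As a minimal sketch, and not part of the specification, the following Python reads such a list into numeric spans so that epitope lengths can be checked. The names `parse_epitopes` and `epitope_lengths` are hypothetical, and the pattern assumes cleanly written three-letter residue codes with no stray spaces.

```python
import re

# Matches spans written as "Glu-93 to Ala-98" (three-letter residue codes).
EPITOPE_PATTERN = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitopes(text):
    """Return (first_residue, start, last_residue, end) for each span found."""
    spans = []
    for first_res, start, last_res, end in EPITOPE_PATTERN.findall(text):
        start, end = int(start), int(end)
        if start > end:
            raise ValueError(f"span runs backwards: {first_res}-{start} to {last_res}-{end}")
        spans.append((first_res, start, last_res, end))
    return spans


def epitope_lengths(spans):
    """Number of residues in each span, counting both end points."""
    return [end - start + 1 for _, start, _, end in spans]


# First three spans listed for SEQ ID NO: 249.
listing = "Glu-93 to Ala-98, Gln-150 to Leu-156, Leu-220 to Leu-231"
spans = parse_epitopes(listing)
print(spans)
print(epitope_lengths(spans))
```

Positions are 1-based and inclusive, matching the residue numbering used in the sequence listing.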

The tissue distribution and homology to RNA helicase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for development 
of diagnostic tests for colon cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 17 

The translation product of this contig has sequence homology to a cytoplasmic
protein that binds specifically to JNK, designated the JNK interacting protein-1 or JIP-1
in mice. JIP-1 caused cytoplasmic retention of JNK and inhibition of JNK-regulated
gene expression.

This gene is expressed primarily in brain, including pituitary, cerebellum, frontal
cortex, and fetal brain, and to a lesser extent in the kidney cortex.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of central nervous system disorders including
ischemia, epilepsy, Parkinson's disease, and schizophrenia. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the central nervous system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Furthermore, the translation product of this contig may suppress the effects of
the JNK signaling pathway on cellular proliferation, including transformation by the
Bcr-Abl oncogene. Preferred epitopes include those comprising a sequence shown in
SEQ ID NO: 250 as residues: Pro-6 to Ser-26, Ala-30 to Asp-41, Gly-55 to Ser-61,
Gly-74 to Thr-80, Tyr-117 to Ala-123, Tyr-167 to Asp-172, Ala-212 to Cys-223,
Pro-239 to Tyr-244.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for enhanced survival and/or differentiation of
neurons as a treatment for neurodegenerative disease. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 18 

The translation product of this gene shares sequence homology with a liver
stage antigen from a protozoan parasite.

This gene is expressed primarily in fetal tissue and to a lesser extent in activated 
T-cells and other immune cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities and diseases of immune function. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to a protozoan antigen indicate that
polynucleotides and polypeptides corresponding to this gene are useful for
treatment/immune modulation of parasitic infections.

FEATURES OF PROTEIN ENCODED BY GENE NO: 19

Preferred polypeptides encoded by this gene comprise the following polypeptide
sequences:

MK,^IGIEPSLATYHHIIRJLFDQPGDPLKRSSFTIYDLMNELMGKRPSPKD
PDDDK^QSAMSICSSLPJDLELAYQVHGLLKTGDNWKHGPDQHRNTYYSKFF
DLICLNEQroVTLKWYEDLIPSAYFPHSQTMmLLQALDV^NRLEVIPKrvVER
(SEQ ID NO:460); and/or KDSKEYGHTFRSDLR£ErLi\lLMARX)KHPPELQVAF
ADCAADIKSA\ESQPIRQTAQDWPATSLNCLA:ILFERuA.GRTQEAWK^lLGLFRKH
NKIPRSELLNELiVrDSAKVSNSPSQAlEVVELASAFSLPICEGLTQRVMSDF.AINQ
EQKEALSNLTALTSDSDTDSSSDSDSDTSEGK (SEQ ID NO:461). Polynucleotides
encoding such polypeptides are also provided.

This gene is expressed primarily in stromal and CD34 depleted bone marrow
cells and to a lesser extent in tissues of embryonic origin.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of hematologic origin including cancers and immune
dysfunction. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the hematopoietic and immune systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 252 as residues: Ser-28 to Gln-34.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful as a growth factor for hematopoietic stem cells or
progenitor cells, which may be useful in the treatment of chemotherapy patients
suffering from neutropenia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 20

Preferred polypeptide fragments can be found in an alternative open reading 
frame. These preferred polypeptides comprise the amino acid sequence: 
MSSDN^SDEDEDLKLELRRLRDKHLKEIQDLQSRQKHEIESL\TKLGKVPPAV^ 
IPPAAPLSGRRRRPTKSKGSKSSRSSSLGNKSPQLSGNLSGQSAASVLHPQQTL 
HPPGNIPESGQNQLLQPLKPSPSSDNLYSAFTSDGAISVPSLSAPGQGTSSTNTV 
GATVNSQAAQAQPPAMTSSRKGTTTDDLHK^ 

^[NYEGPGMARKFSAPGQLCISMTSNXGGSAPISAASATSLGHFTKSMCPPQQY 
GFPATPFGAQWSGTGGPAPQPLGQFQPVGTASLQNFNISNLQKSISNPPGSNL 
RTT (SEQ ID NO:462); IQDLQSRQKHEIESL^TKLGKVPPAVIIPPAAPLSGRRRR 
PTKSKGSKSSRSSSLGNKSPQLSGNLSGQSAASVLHPQQTLHPPGNIPESGQN 
QLLQPLKPSPSSDNLYSAFTSDGAISVPSLSAPGQGTSST (SEQ ID NO:463); 
TSDGAISVPSLSAPGQGTSSTNTVGATVNSQAAQAQPP.AIV1TSSRKGTFTDDLH - 
(SEQ ID NO:464); KGHMNYEGPGiVLARKFSAPGQLCISMTSNLGGSAPISAAS 
ATSLGHFTK (SEQ ID NO:465); QPLKPSPSSDNLYSAFTSDGAISVPSLSAPG 
(SEQ ID NO:466). Also preferred are polynucleotide fragments encoding these 
polypeptide fragments. 

This gene is expressed in fetal liver and tissues associated with the CNS. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, liver and CNS diseases. Similarly, polypeptides and antibodies directed 
to these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the liver and CNS, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 253 as residues: 
Gln-26 to Lys-34. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of liver diseases such
as hepatocellular carcinomas and diseases of the CNS.

FEATURES OF PROTEIN ENCODED BY GENE NO: 21

In an alternative reading frame, this gene shows sequence homology to two
recently cloned genes, karyopherin beta 3 and Ran_GTP binding protein 5. (See
Accession Nos. gi|2102696 and gnl|PID|e328731.) The Ran_GTP binding protein is
related to importin-beta, the key mediator of nuclear localization signal (NLS)-dependent
nuclear transport. Based on homology, it is likely that this gene may have activity
similar to the Ran_GTP binding protein. Preferred polypeptide fragments comprise the
amino acid sequence: VRVAAAESMXLLiECAX^
IGTEPDSDVLSEIMHSFAK (SEQ ID NO:467). Also preferred are polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed in thymus tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment for immune disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 22 

This gene is expressed primarily in prostate and osteoclastoma tissues. 
Preferred polypeptide fragments also comprise the amino acid sequence: 

MEINNQNCHVIDLVRTV^NGVEGLLrFGAFLPES\VT_IGVRCSSEPPK.ALLLIL 
AHSQKRRLDGWSFIRHLRVHYCVSLTIHFS (SEQ ID NO:468). Also preferred are
polynucleotide sequences encoding this polypeptide fragment.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, bone and prostate diseases, and cancers, particularly of the bone and
prostate. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the bone and prostate systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 255 as residues: Met-1 to Ser-11.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment for bone and prostate 
disorders, especially cancers of those systems. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 23 

This gene shares sequence homology with the FK506-binding protein (FKBP-13)
family, a known cytosolic receptor for the immunosuppressants. Recently, another
group has cloned a very similar gene, recognizing the homology to the FK506-binding
protein family, calling their gene FKBP23. (See Accession No. 2827255.)

This gene is expressed primarily in lymphoid tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample, especially for those susceptible to immune suppressant therapies and 
for diagnosis of diseases and conditions, which include, but are not limited to, immune 
suppressant disorders. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 256 as residues: Ala-19 to Val-31,
Arg-38 to Gly-49, Ala-61 to Lys-66, Tyr-68 to Pro-73, Gly-116 to Ala-121, Asp-154 to
Ser-162, Glu-173 to Gln-186, Phe-194 to Gly-203, Pro-207 to Val-212.

The tissue distribution and homology to FKBP-12 and -13 indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment for immune suppressant disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 24

This gene is expressed primarily in the brain and in the retina. This gene maps 
to chromosome 8, and therefore can be used in linkage analysis as a marker for 
chromosome 8. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological and ocular associated disease states. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the central
nervous system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 257 as residues: Cys-34 to Asp-40.

The tissue distribution in retina indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and/or detection of eye disorders
including blindness, color blindness, impaired vision, short and long sightedness,
retinitis pigmentosa, retinitis proliferans, and retinoblastoma. Expression in the brain
indicates that this gene is useful for the detection/treatment of neurodegenerative disease
states and behavioral disorders such as Alzheimer's Disease, Parkinson's Disease,
Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive compulsive
disorder and panic disorder.

FEATURES OF PROTEIN ENCODED BY GENE NO: 25 

This gene shows sequence homology to a newly identified class of proteins
expressed in the nervous system, called the stathmin family. (See Accession No. 2585991;
see also Eur. J. Biochem. 248 (3), 794-806 (1997).) The stathmin family appears to be
a ubiquitous phosphoprotein involved as a relay integrating various intracellular
signaling pathways. These pathways affect cell proliferation and differentiation.
Preferred polypeptide fragments comprise the amino acid sequence:
QDKHAEEVRKNKELKEEASR (SEQ ID NO:469); QQDLSPWAAPVGCPLXXASX
TCHXLPLSGCLRRQSXSLPVVAXLCFWFSCPLASLFVPGQPCVTCPFPSLPFQD
KHAEEVRKNKELKEEASR (SEQ ID NO:470). Also preferred are the
polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed primarily in brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the central nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the detection/treatment of neurodegenerative 
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's 
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 26 

The polynucleotide sequence of this gene contains a domain similar to a FIG 
ligand peptide. Preferred polypeptide fragments comprise the amino acid sequence: 
PTRCCTTQPCRSSARRPCWVPMVPSPEGREXQPTCPS (SEQ ID NO:471). Thus,
this gene may have activity as binding to FIG receptors, a process known to promote 
angiogenesis and/or lymphangiogenesis. 

This gene is expressed in human tonsil, and to a lesser extent in 
teratocarcinoma, placenta, colon carcinoma, and fetal kidney. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for identification of the tissue(s) or cell type(s) present in a biological sample 
and for diagnosis of diseases and conditions, which include, but are not limited to, 
diseases of the tonsil, as well as cancers, such as colon, reproductive, and kidney 
cancers. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
tonsils, colon, reproductive organs, and kidneys, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 259 as residues: 
Pro-22 to Glu-33.

The tissue distribution in tonsil and several cancers and fetal tissues indicates
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and treatment of diseases of the tonsil or colon, such as tonsillitis,
inflammatory diseases involving nose and paranasal sinuses, especially during the 
infection of influenza, adenoviruses, parainfluenza, rhinoviruses. The gene may also be 
useful in the diagnosis and treatment of neoplasms of nasopharynx or colon origins. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 27 

In an alternative reading frame exists a large open reading frame that encodes a 
preferred polypeptide. Preferred polypeptide fragments comprise the amino acid 
sequence: 

MKRSLNENSARSTAGCLPVPLFNQKKRI^RQPLTSNPLKDDSGISTPSDNYDFP
PLPTDWAWEAVNPEXAPVMKTVDTGQIPHSVSRPLRSQDSVFNSIQSNTGRSQ
GGWSYRDGNKNTSLKTWXKNDFKPQCKRTNLVANDGKNSCPMSSGAQQQK
QLRTPEPPNLSRNKETELLRQTHSSKISGCTMRGLDKNS.ALQTLKPNFQQNQY
KXQMLDDIPEDNTLKETS^
FEVLAVLDSAVTPGPYYSKTELMRDGKNT^
NYDQKKNffQCVSVRPASVSEQKTFQA (SEQ ID NO:472);
SQDSVFNSIQSNTGRSQGGWSYRDGNKNTSLKTWXKNDFKPQCKR (SEQ ID NO:473);
NKETELLRQTHSSKISGCTMRGLDKNSALQTLKPNF (SEQ ID NO:474);
SSLRHSAVESMKYWREHA (SEQ ID NO:475); and
PRLIRGRVHRCVGNYDQKKNIFQCVSVRPASVSEQKTFQAFV (SEQ ID NO:476).

This gene is expressed primarily in human testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive disorders, including cancer. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the male reproductive system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a hormone with reproductive or other systemic
functions; contraceptive development; male infertility of testicular causes, such as
Klinefelter's syndrome, varicocele, orchitis; male sexual dysfunctions; testicular
neoplasms; and inflammatory disorders such as epididymitis. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 28

This gene is expressed primarily in apoptotic T-cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases relating to T cells, as well as cancer in general. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for immune disorders. Moreover, since the gene
was isolated from an apoptotic cell, and based on the understanding of the relationship
between apoptosis and cancer, it is likely that this gene plays a role in the genesis of
cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 29
This gene is expressed primarily in human tonsils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, gastrointestinal disorders. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the gastrointestinal system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of gastrointestinal 
diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 30

The translation product of this gene shares sequence homology with C44C1.2 
gene product of Caenorhabditis elegans with unknown function. Preferred polypeptide 
fragments comprise the amino acid sequence: 

GVFRPCVCGRPASLTCSPLDPEVGPYCDTPTMRTLFNLLWLALACSPVHTTLSK 
SDAKK.^SKTLLEKSQFSDKPVQDRGLVVTDLK,AESVVLEHRSYCSAIC\RDRH
FAGDVLGYVTPWNSHGYDVTKW 

VDQGWMRA VRKHAKGLffl W KT W Q V A 

KNQHFT)GFV"VE VWNQLLS Q KR VGLIHMLTHL AE AL H Q ARLL ALL VIPP AITPGT 
DQLGMFTHKEFEQI^VUM 

KXKWRTKSSWGSTSiVIXWTXRXPXD

PQ (SEQ ID NO:477); TCSPLDPEVGPYCDTPTMRTLFMXWL.\LACSPyHTTLS
(SEQ ID NO:478); L VWDLKAES VVLEHRS YCS AKARD RKFAGD VLG YVTPW 
NSHGYDVTKVFGSKF (SEQ ID NO:479); REMFEVTGLHDVDQGWMRaVrk 
HAKGLHI\TRLLFEDWTYT)DFRNVLDSEDE (SEQ ID NO:480); HFDGFVVEVW
NQLLSQKRVGLIHlvn-THL.\E^HQARLL.^LLVlPPArrPGTDQLGM (SEQ ID
NO:481); DGFSLiVirYDYSTAHQPGPNAPLS^-V^CVQVLDPKXKWRTKSSW
GST (SEQ ID NO:482). Also preferred are polynucleotide fragments encoding these
polypeptide fragments. This gene maps to human chromosome 11, and therefore is
useful in linkage analysis as a marker for chromosome 11.

This gene is expressed primarily in human T cells and to a lesser extent in 
human colon carcinoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune disorders and cancer. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the immune and gastrointestinal systems, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 263 as residues: Leu-21 to Ala-30, Ser-38 to Asp-47, Pro-87 to Asp-94, Leu-197 
to Thr-204, Pro-256 to Ser-262, Thr-277 to Arg-282, Thr-293 to Trp-303.
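
Epitope ranges written as "Leu-21 to Ala-30" can be turned into sequence slices mechanically. The following is a minimal sketch, assuming 1-based inclusive residue numbering; the helper names and the short toy sequence are hypothetical and are not taken from the sequence listing.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Leu-21 to Ala-30".
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return (start_residue, start, end_residue, end) tuples found in text."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def extract_epitope(sequence, epitope_range):
    """Slice a 1-based inclusive range and check its boundary residues."""
    start_res, start, end_res, end = epitope_range
    fragment = sequence[start - 1:end]
    if start < 1 or len(fragment) != end - start + 1:
        raise ValueError("range lies outside the sequence")
    if fragment[0] != THREE_TO_ONE[start_res] or fragment[-1] != THREE_TO_ONE[end_res]:
        raise ValueError("boundary residues do not match the sequence")
    return fragment


if __name__ == "__main__":
    toy_sequence = "MKLAGSD"  # hypothetical, for illustration only
    for rng in parse_epitope_ranges("Leu-3 to Gly-5"):
        print(rng, extract_epitope(toy_sequence, rng))
```

Checking the boundary residues against the sequence is a cheap way to catch numbering or transcription errors in a residue range before it is used.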

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of immune 
disorders and gastrointestinal diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 31 

The translation product of this gene shares sequence homology with ribosomal
protein L11 of Caenorhabditis elegans. (See Accession No. 156201.) Preferred
polypeptide fragments comprise the amino acid sequence: 
ERGVSINQFCKEFNERTKDIKEGIPLPT 

IEKGARQTGKEVAGLVTLKHVYELARlIC^QDEAF.ALQDVPLSSVVRSnGSARSL 
GIRWKDLSSEEIAAFQKERAIFLAAQKEADLAAQEEAAKK (SEQ ID NO:483).
Also preferred are polynucleotide fragments encoding these polypeptide fragments. 

This gene is expressed in human embryo tissue and to a lesser extent in human 
epithelioid sarcoma and other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for identification of the tissue(s) or cell type(s) present in a biological sample
and for diagnosis of diseases and conditions, which include, but are not limited to,
developmental disorders and epithelial cell cancer. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the embryonic and epithelial cell systems, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 264 as residues: Lys-34 to Gly-40.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of developmental 
disorders and epithelial cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 32

This gene is expressed primarily in resting T cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory and general immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of disorders of
the immune system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 33 

This gene is believed to reside on chromosome 1. Accordingly, polynucleotides 
derived from this gene are useful in linkage analysis as chromosome 1 markers.

This gene is expressed primarily in prostate and to a lesser extent in Soares adult
brain, human umbilical vein endothelial cells, and amniotic cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, prostate-related disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the urinary system and nervous system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that the protein products of this gene are useful 
for the diagnosis and treatment of disorders of the urinary and nervous systems. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 34 

This gene shares sequence homology with the R05G6.4 gene product. (See Accession No.
gi|1326338.) This gene also shares sequence homology with the cyclophilin-like protein
CyP-60. (See Accession No. 1199598; see also Biochem. J. 314 (1), 313-319
(1996).) Preferred polypeptide fragments comprise the amino acid sequence:

AVYTYHEKKKDTAASGYGT 

YEREATLEmHQKKELARQMKAYEKQRGTRREEQKZLQRAASQDHVRGFLE^ 

SAIVSRPLNPFTAKALSGTSPDDVQPGPSVGPPSKDKDKVLPSFWIPSLTPEAK 

ATKLEKPSRTVTCPMSGKPLRiMSDLTPVHFTPLDSSVDRVGLITRSERYVCAVT 

RDSLSNATPCAVLRPSGAVVTLECVEK^ 

(SEQ ID NO:484); YLYEREAILEYILHQKXEIARQ 

RAASQDHVRGFLE (SEQ ID NO:485); and FTAKALSGTSPDDVQPGPSVGPP 
SKDKDKVLPSFWIPSLTPEAKATKLEKPSRTVTCPMSGKPL (SEQ ID NO:486).
Also preferred are polynucleotide fragments that encode these polypeptide fragments. 
This gene is expressed primarily in human testis and to a lesser extent in other
tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, male reproductive disorders and in particular testicular cancer. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of disorders of the
male reproductive system and in particular of testicular cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 35 

The translation product of this gene shares sequence homology with Lpe5p of
Saccharomyces cerevisiae, which is thought to be important in the metabolism of
phospholipids.

This gene is expressed primarily in liver and brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, metabolic disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the metabolic and nervous systems, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 268 as residues: Pro-14 to Leu-20, Lys-28 to Asn-38, Arg-109 to Arg-114, Lys-119
to Asn-124, Glu-152 to Leu-157, Pro-172 to Val-180.

The tissue distribution and homology to Lpe5p of Saccharomyces cerevisiae
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
the diagnosis and treatment of metabolic and nervous disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 36
This gene shares sequence homology with the nuclear ribonucleoprotein U (HNRNP
U) encoded by C. elegans. (See Accession No. gi|1703576.) Preferred polypeptide
fragments comprise the amino acid sequence: 
MDTSENT^ENDVPEPPMPIAD

SATKGVPAGNSDTEGGQPGRKRRWGASTATTQKKPSISITTESLKS 
AGQEAVVDLH.\DDSRISEDETERNGDDGTHDKGLKICRTWQVVP.\EGQENGQ 
REEEEEEKEPE AEPP VPPQ VS VE V ALP PP AEHE VKKVTLGDTLTRRS IS QQ KS G V 
SmDDPVRTAQVPSPPRGKISMVfflSNXVRPFTLGQLKELLGRTGTLVEE.\FW 
DfGKSHCFVTYSTVEEAVATRTA^

LVDRPSETKTEEQGIPRPLHPPPPPPVQPPQHPR.AEQREQERAVREQWAERERE 
MERRERTRSEREWDRDKVREGPRSRSRSRXRRRKERAKSKEKKSEKKEfC^QE 
EPPAKLLDDLFRKTKAAPCIYWLPLTO 
EQKEREKE.A£R£RNRQLEREKRREHSRERI)RERERE 
RGRERDRRDTKRHSRSRSRSTPVRDRGGR (SEQ ID NO:488). Also preferred are
the polynucleotide fragments encoding this polypeptide fragment.

This gene is expressed primarily in epididymis.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the male reproductive system. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the male reproductive system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of male 
reproductive disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 37
This gene is expressed primarily in amygdala. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory diseases and reproductive disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the amygdala,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of inflammatory
diseases and reproductive disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 38 

This gene shares sequence homology with human opsonin protein P35
fragment. (See Accession No. R94181.) The opsonin protein activates the phagocytosis
of pathogenic microbes by phagocytic cells. Preferred polypeptide fragments comprise 
the amino acid sequence: GCDSCPPHLPREAFAQDTQAEGECSSRAERADMCPDAP 
PSQEVPEGPGAAP (SEQ ID NO:489). Also preferred are polynucleotide fragments 
encoding these polypeptide fragments. 
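
Whether a given polynucleotide fragment encodes a stated polypeptide fragment can be checked by translating it with the standard genetic code. The sketch below is illustrative only: the function names are hypothetical, and the example DNA string is just one of many possible encodings of the first four residues (GCDS) of the fragment above, not a sequence taken from this document.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    dna = dna.upper().replace("U", "T")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encodes_fragment(dna, peptide):
    """Return True if dna translates exactly to the given peptide."""
    return translate(dna) == peptide.upper()


if __name__ == "__main__":
    example_dna = "GGCTGCGACAGC"  # hypothetical encoding of G-C-D-S
    print(translate(example_dna))
    print(encodes_fragment(example_dna, "GCDS"))
```

Because the genetic code is degenerate, many different polynucleotides pass this check for the same peptide, which is why the text refers to polynucleotide fragments in the plural.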

This gene is expressed in immune-related tissues such as thymus, macrophages, and
T cells, and to a lesser extent in many other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders and infectious disease. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system and infectious disease,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 271 as residues: Lys-9 to Arg-14, Met-38 to Asp-51. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of immune disorders,
as well as the treatment and/or diagnosis of infectious disease. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 39 

The translation product of this gene shares sequence homology with alpha-2
type I collagen, which is thought to be important in tissue repair. (See, e.g., 211607.)
Preferred polypeptide fragments comprise the amino acid sequence: PQLPSCGRPW
PGTASVFQSHTQGPREDPDPCRAQGSAGTHCPISLSPPRQ (SEQ ID NO:490).
Also preferred are the polynucleotide sequences encoding these polypeptide sequences.

This gene is expressed primarily in the brain and to a lesser extent in the kidney
and thymus.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, brain, kidney, and immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the brain, kidney, and immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. 

The tissue distribution and homology to alpha-2 type I collagen indicate that
polynucleotides and polypeptides corresponding to this gene are useful for tissue repair
and for the diagnosis and treatment of brain, kidney, and immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 40 

The translation product of this gene shares sequence homology with
mini-collagen, which is thought to be important in tissue repair and tumor metastasis. (See
Accession No. gnl|PID|d1006976.) Preferred polypeptide fragments comprise the
amino acid sequence: PGFRGPSGSLGCSFFPRSLGRVLPPGCQRPGAHAD
SSPPPTP (SEQ ID NO:491). Also preferred are polynucleotides encoding this
polypeptide fragment.

This gene is expressed in ovarian cancer and to a lesser extent in dendritic cells
and smooth muscle.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, tumor metastasis and tissue repair. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the tumor metastasis and tissue repair, 
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 273 as residues: Asn-2 to His-11.

The tissue distribution and homology to the mini-collagen gene indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis 
and treatment of tumor metastasis and tissue repair. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 41 

This gene shares sequence homology with the HIV TAT protein. (See 
Accession No. 328416.) Preferred polypeptide fragments comprise the amino acid
sequence:

EDLKKPDPASLRAASCGEGKKRKACKNCTCGLAEELEKEK
SREQMSSQPKSACGNCYLGDAFRCASCPYLGMPAFKPGEKVLLS (SEQ ID
NO:492); EDLKKPDPASLRAASCGEGKKRKACKNCTCGLAEELEKEK
SREQMSSQPKSACGNCYLGDAFRCASCPYLGMPAFKPGEKVLLSDSNLHD
(SEQ ID NO:493); CGNCYLGDAFRCASCPYLGMPAFKPGEKVLLSDS
(SEQ ID NO:494); SCGEGKKRKACKNCTCGLAEELEKE (SEQ ID NO:495);
SQPKSACGNCYLGDAFRCASC (SEQ ID NO:496); and REAGQNSERQYVS
LSRD (SEQ ID NO:497). Also preferred are polynucleotide fragments encoding these 
polypeptide fragments. 

This gene is expressed primarily in the infant brain and to a lesser extent in the
breast and testes.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, brain, testes and breast disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the brain, testes and breast disorders,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 274 as residues: Pro-7 to Val-15. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of brain, testes and
breast, and other related disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 42

This gene is expressed primarily in the infant brain, human cerebellum, and to a
lesser extent in medulloblastoma.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, brain related disorders and medulloblastoma and other brain cancers.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
brain related disorders and brain cancers, including medulloblastoma, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 275 as residues: Thr-41 to Glu-47.

FEATURES OF PROTEIN ENCODED BY GENE NO: 44

This gene is expressed primarily in the fetal brain, cerebellum and to a lesser 
extent in the placenta. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal cell related disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the neuronal cell related disorders, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 277 as residues: Thr-20 to Gly-28. 

The tissue distribution and homology to proline-rich protein genes indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis 
and treatment of neuronal cell related disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 45 

The translation product of this gene shares sequence homology with 
precerebellin of human, which is thought to be important in synaptic physiology. (See
Accession No. gi|180251.) It has been observed that cerebellin-like immunoreactivity is
associated with Purkinje cell postsynaptic structures. Thus, it is likely that this gene
also has synaptic activity. Preferred polypeptide fragments comprise the amino acid
sequence: QEGSEPVLLEGECLVVCEPGRAAAGGPGGAALGEAPPGRVAFXAV 
RSHHHEPAGETGNGTSGAIYFDQVLVNEGGGFDRASGSFVAPVRGVYSFRFH 
VVKVYNRQTVQVSLMLNTWPVISAFAM 

LRRGXSTGW (SEQ ID NO:499). Also preferred are polynucleotide fragments 
encoding these polypeptide fragments. 

This gene is expressed primarily in cerebellum and infant brain. By Northern 
analysis, a single transcript of 2.4 kb was observed in brain tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of human brain related
disorders, brain cancers, and medulloblastoma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 43 

The translation product of this gene shares sequence homology with a 
phosphotyrosine-independent ligand for the lck SH2 domain which is thought to be 
important in signal transduction related to phosphotyrosine-independent ligand for the 
lck SH2 domain. (See Accession No. gi|1184951.) Preferred polypeptide fragments
comprise the amino acid sequence: ESSGQARTLADPGPGWPRQQGMCFGSLT 
GLSTTPHGFLTVSAEADPRLESLSQMLSM 

DTIQYSKH (SEQ ID NO:498). Also preferred are polynucleotide fragments encoding 
this polypeptide fragment. It is likely that this gene is a new member of a family of 
phosphotyrosine-independent ligands for the lck SH2 domains. 

This gene is expressed primarily in the placenta and to a lesser extent in 
endothelial cells and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, reproductive, cardiovascular, immune, and infectious diseases. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the
cardiovascular, reproductive, and immune systems, and infectious diseases, expression
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution and homology to a phosphotyrosine-independent ligand
for the lck SH2 domain indicate that polynucleotides and polypeptides corresponding
to this gene are useful for diagnosis and treatment of cardiovascular, reproductive, and
immune system diseases, as well as infectious diseases.

not limited to, neuronal cell signal transduction and synaptic physiology. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of neuronal cell
signal transduction and synaptic physiology, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to gene or gene family indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of neuronal cell related disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 46

This gene is expressed in fetal liver and spleen, and to a lesser extent in bone
marrow, umbilical vein, and T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the immune system, particularly hematopoiesis. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly hematopoietic
and immune disorders, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 279 as residues: Asp-30 to Glu-57.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of hematopoietic and
immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 47

The translation product of this gene shares sequence homology with a 12 kD
nucleic acid binding protein of feline calicivirus, which is thought to be important in viral
replication. (See Accession No. 59264.)

This gene is expressed primarily in human cardiomyopathy and to a lesser
extent in T helper cells, fetal brain and synovial sarcoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cardiomyopathy as well as viral infection. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the cardiovascular system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 280 as residues: Trp-20 to Cys-26.

The tissue distribution in cardiomyopathy and homology to a viral 12 kD nucleic
acid binding protein indicate that polynucleotides and polypeptides corresponding to
this gene are useful for diagnosis and intervention of cardiomyopathy, including those
caused by ischemic, hypertensive, congenital, valvular, or pericardial abnormalities.
The gene expression pattern may be the consequence or the cause of these conditions.

FEATURES OF PROTEIN ENCODED BY GENE NO: 48

The translation product of this gene shares sequence homology with a tumor
necrosis factor related gene product, which is thought to be important in tumor necrosis,
bacterial and viral infection, immune diseases and immunoreactions.

This gene is expressed primarily in colon and to a lesser extent in ovarian and
breast cancers.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, tumors of colon, ovary or breast origins. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the colon, ovary and breast, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to tumor necrosis factors indicate that
polynucleotides and polypeptides corresponding to this gene are useful for intervention
in cancers of colon, ovary and breast origins, because TNF family members are known
to be involved in tumor development.

FEATURES OF PROTEIN ENCODED BY GENE NO: 49 

The translation product of this gene shares sequence homology with mucins,
such as epithelial mucin, which is thought to be important in extracellular matrix
functions such as protection, lubrication and cell adhesion (See, for example, Accession
No. R63002). Preferred polypeptide fragments comprise the following amino acid
sequence: PRSRPALRPGRQRPPSHSATSGVLRPRKKPDP (SEQ ID NO:500).
Also preferred are polynucleotide fragments encoding these polypeptide fragments.
Moreover, this gene maps to chromosome 22q11.2-qter, and therefore can be used as
a marker in linkage analysis for chromosome 22.

This gene is expressed primarily in corpus callosum.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, tumors, especially of corpus callosum, as well as metastatic lesions.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
corpus callosum and other solid tissues, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to mucins indicate that polynucleotides
and polypeptides corresponding to this gene are useful as serum tumor markers or
immunotherapy targets, because tumor cells have greatly elevated levels of mucin
expression and shed the molecules into the epithelial tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 50 

This gene is expressed primarily in CD34 depleted buffy coat cord blood and 
primary dendritic cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematopoietic disorders and immunological disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the hematopoietic and
immune systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution in CD34 depleted buffy coat cord blood and primary
dendritic cells indicates that polynucleotides and polypeptides corresponding to this
gene are useful for diagnosis and treatment of hematopoietic and immune disorders.
Secreted or cell surface proteins with the above tissue distribution are often involved in
cell activation (e.g., cytokines) or are molecules involved in cell surface activation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 51 

The translation product of this gene shares sequence homology with the
interferon-induced 1-8 gene encoded polypeptide, which is thought to be important in
binding to the retroviral rev responsive element. Preferred polypeptide fragments comprise
the following amino acid sequence: MTLITPSXKLTFXKGNKSWSSRACSSTLVDP
(SEQ ID NO:501). Also preferred are polynucleotide fragments encoding these
polypeptide fragments.

This gene is expressed primarily in CD34 positive cells and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, retroviral infection, such as AIDS, and other immune disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 284 as residues: Gln-51 to Trp-62.

The tissue distribution and homology to interferon-induced gene 1-8 indicate
that polynucleotides and polypeptides corresponding to this gene are useful for
intervention in retroviral infection, including HIV. The factor may be involved in viral
stability or viral entry into the cells. Alternatively, the virus/factor complex may elicit 
the cellular immune reaction. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 52 

This gene shares sequence homology with immunoglobulin lambda chain (See
Accession No. 28654S4). Therefore, it is likely that this gene has activity similar to an
immunoglobulin lambda chain. Preferred polypeptide fragments comprise the following
amino acid sequence: GHPSPALSIAPSDGSQLPCDEVPYGEAHVTRYCKKPLTNS 
HLETEAQSSSL (SEQ ID NO:502). Also preferred are polynucleotide fragments 
encoding these polypeptide fragments. 

This gene is expressed primarily in Hodgkin's lymphoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, Hodgkin's lymphoma and other immune disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 285 as residues: Pro-27 to Thro 2.

The tissue distribution in Hodgkin's lymphoma and the sequence homology
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis of Hodgkin's lymphoma, since the elevated expression and secretion by the
tumor mass may be indicative of tumors of this type. Additionally, the gene product may
be used as a target in the immunotherapy of the cancer. Because the gene is expressed in
cells of lymphoid origin, the natural gene product may be involved in immune
functions. Therefore, it may also be used as an agent for immunological disorders
including arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 53 

This gene has extensive homology to the cDNA for the Homo sapiens mRNA for the
ISLR gene (See Accession No. AB003184). This protein is considered to be a new
member of the Ig superfamily and contains a leucine-rich repeat (LRR) with conserved
flanking sequences and a C2-type immunoglobulin (Ig)-like domain. These domains are
important for protein-protein interaction or cell adhesion, and therefore it is possible that
the novel protein ISLR may also interact with other proteins or cells. The ISLR gene
was mapped on human chromosome 15q23-q24 by fluorescence in situ hybridization
(See Medline Article No. 97468140). Homology to the ISLR gene has been confirmed
by another independent group as well (See Accession No. Hs.102171).

This gene is expressed in a number of tissues including human retina, heart,
skeletal muscle, prostate, ovary, small intestine, thyroid, adrenal cortex, testis,
stomach, spinal cord, fetal lung and fetal kidney tissues, colon, tonsil and stomach
cancer, and to a lesser extent in endometrial stromal cells treated with estradiol, breast
tissue, synovium, lymphoma, and a number of other tumors.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, tumors of colon, ovary and breast origins. However, due to the wide
range of expression in various tissues, the protein may play a vital role in the development
of cancer in other tissues as well, not just those mentioned above. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the colon, ovary and
breast, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Additionally, this gene maps to chromosome
15q23-q24, and therefore can be used as a marker in linkage analysis for chromosome 15.

The tissue distribution in tumors of colon, ovary, and breast origins indicates
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. The protein, as well as antibodies directed against the
protein, may show utility as tumor markers and/or immunotherapy targets for the above
listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 54 

This gene has homology to multidrug resistance gene 1 (See Accession No.
P06795). Preferred polynucleotide fragments comprise the following sequence: 
GCrTCGTGTCCAACCCTCTTGCCCTTCGCCTGTGTGCCTGGAGCCAGTCCCA 
CCACGCTCGCGTTTCCTCCTGTAGTGCTCACAGGTCCCAGCACCGATGGCA 
TTCCCTTfGCCCTGAGTCTGCAGCGGGTCCCTTTTGTGCTTCCTTCCCCTCA 
GGTAGCCTCTCTCCCCCTGGGCCACTCCCGGGGGTGAGGGGGTTACCCCTT 
CCC AGTG ITTTTI ATTCCTGTGGGGCTC ACCCC A AAGT ATT AAAAGT AGCTTT 
GTAA (SEQ ID NO:503). Also preferred are polypeptide fragments encoded by these
polynucleotide fragments. 

This gene is expressed primarily in lung, esophagus, leukemia (Jurkat cells) and
breast cancers, and to a lesser extent in macrophages treated with GM-CSF, fetal tissues,
and a wide range of tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers of a wide range of origins. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the solid tumors, lung and leukemia,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Furthermore, due to the high expression level in lung tissue and the proposed
function of the multidrug resistance protein 1 gene as the efflux pump responsible for
low drug accumulation in multidrug-resistant cells, the protein, as well as mutants thereof,
may also be beneficial as a target for gene therapy, particularly for the chronic patient.
Preferred epitopes include those comprising a sequence shown in SEQ ID NO: 287 as
residues: Met-1 to Lys-16.

The tissue distribution in a wide range of cancers and fetal tissues indicates that
polynucleotides and polypeptides corresponding to this gene are useful for detection of
cells in active proliferation, such as cancers. The gene products may be used as cancer
markers or immunotherapy targets.

FEATURES OF PROTEIN ENCODED BY GENE NO: 55 
This gene maps to the X chromosome.

This gene is expressed primarily in the brain and to a lesser extent in the
developing embryo.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegenerative disease states and developmental disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders, including sex-linked disorders, of the above tissues or cells,
particularly of the neurological, developmental, and cardiovascular systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Moreover, this gene maps to the X chromosome, and therefore may be used
as a marker in linkage analysis for this chromosome.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, Klinefelter's syndrome, schizophrenia, mania, dementia,
paranoia, obsessive compulsive disorder and panic disorder. In addition, the gene or
gene product may also play a role in the treatment and/or detection of developmental
disorders associated with the developing embryo, sex-linked disorders, or
disorders of the cardiovascular system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 56 
The translation product of this gene shares sequence homology with paxillin,
which is thought to be important in mediating signal transduction from growth factor
receptors to the cytoskeleton. Preferred polynucleotide fragments comprise the
following sequence: TGGCTCACTGTCTTACAATCACTGCTGTGGAATCATGA 
TACCACTi 1 i AGCTCTTTGC ATCTTCCTTC AGTGT A 1 ' I ' 1 i TG 1 1 1TTCAAGAGG 
AAGTAGATTTTAACTGGACAACTTTGAGTACTGGGTTGTGGTTTCAA (SEQ ID NO:506).
Also preferred are polypeptide fragments
encoded by these polynucleotide fragments. More preferably, polypeptide fragments 
comprise the amino acid sequence: LDELMAHLTEMQAKVAVRAD
AGKKHLPDKQDHKASLDSMLGGLEQELQDLGIATVPKGHCASCQKPIAGKVI
HALGQSWHPEHFVCTHCKEEIGSSPFFERSGLXYCPNDYHQLFSPRCAYCAAP
ILDKVLTAMNQTWHPEHFFCSHCGEVFGAEGFHEKDKKPYCRKDFLAMFSPK
CGGCNRPVLENYLSAMDTVWHPECFVCGDCFTSFSTGSFFELDGRPFCELHYH
HRRGTLCHGCGQPITGRCISAMGYKFHPEHFVCAFCLTQLSKGIFREQNDKTY
CQPCFNKLF (SEQ ID NO:507); KASLDSMLGGLEQELQDLGIATVPKGHC
ASCQKPIAGKVIHAL (SEQ ID NO:508); CPNDYHQLFSPRCAYCAAPILDKVL
TAMNQTWHPEHFFCSHCGEVFGAEG (SEQ ID NO:509); DKKPYCRKDFLAM
FSPKCGGCNRPVLENYLSAMDTVWHPECFVCGDCFTSFSTGSFFELDGRPFCE
L (SEQ ID NO:510); CGQPITGRCISAMGYKFHPEHFVCAFCLTQLSKGIFRE
QNDKTYCQ (SEQ ID NO:511). Polynucleotide fragments encoding these preferred
polypeptide fragments are also contemplated.

This gene is expressed primarily in brain, and to a lesser extent in the
developing embryo.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disease states and developmental abnormalities. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune and
nervous systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Moreover, since this gene shares homology with a
gene that maps to chromosome 11 (See Accession No. T87404), the gene as well as its
translated product may be used for linkage analysis on chromosome 11.

The tissue distribution and homology to paxillin indicate that polynucleotides
and polypeptides corresponding to this gene are useful for the treatment and/or detection
of disease states associated with abnormal signal transduction in brain and/or the
developing embryo. This would include treatment or detection of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder, and also the treatment and/or detection of
embryonic development defects.



FEATURES OF PROTEIN ENCODED BY GENE NO: 57

This gene is expressed primarily in fetal spleen, brain, and to a lesser extent in 
six week old embryo. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders, neurological disorders, and developmental
abnormalities. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune and developmental systems, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 290 as residues: Arg-28 to Gly-34.

The expression of this gene in fetal spleen indicates that polynucleotides and
polypeptides corresponding to this gene are useful for treatment/detection of immune
disorders such as arthritis, asthma, immune deficiency diseases such as AIDS, and
leukemia. In addition, the expression of this gene in the early embryo indicates a key
role in embryo development, and hence the gene or gene product could be used in the
treatment and/or detection of embryonic development defects. This would include
treatment or detection of neurodegenerative disease states and behavioral disorders such
as Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, schizophrenia,
mania, dementia, paranoia, obsessive compulsive disorder and panic disorder, and also
the treatment and/or detection of embryonic development defects.

FEATURES OF PROTEIN ENCODED BY GENE NO: 58 
The translation product of this gene shares sequence homology with the gene disrupted
in the neurodegenerative disease dentatorubral-pallidoluysian atrophy. Moreover, a long
open reading frame exists in an alternative frame. Preferred polypeptide fragments
comprise the following:

MGSSQSVEIPGGGTEGYHVLRVQENSPGHRAGLEPFFDFIVSINGSRLNKDND
TLKDLLKXNVEKPVKMLIYSSKTLELRETSVTPSNLWGGQGLLGVSIRFCSFD
GANENVWHVLEVESNSPAALAGLRPHSDYIIGADTVMNESEDLFSLIETHEAKP
LKLYVYNTDTDNCREVnTPNSAWGGEGSLGCGIGYGYLHRIPTRPFEEGKKIS
LPGQMAGTPITPLKDGFTEYQLSSVNPPSLSPPGTTGIEQSLTGLSISSTPPAVSS
VLSTGVPTVPLLPPQVNQSLTSVPPMNPATTLPGLMPLPAGLPNLPNLNLNLPA
PHIMPGVGLPELVNPGLPPLPSMPPRNLPGIAPLPLPSEFLPSFPLVPESSSAASS
GELLSSLPPTSNAPSDPATTTAKADAASSLTVDVTPPTAKAPTTVEDRVGDSTPV
SEKPVSAAVDANASESP (SEQ ID NO:512); SVEIPGGGTEGYHVLRVQENSPGH
RAGLEPFFDHVSINGSRLNKDNDTLKDLLKXNVEKPVKMLIYSSKTLELRETS
VTPSNLWGGQGLLGVSIRFCSFDGANENVWH (SEQ ID NO:513); ESNSPAA
LAGLRPHSDYIIGADTVMNESEDLFSLIETHEAKPLKLYVYNTDTDNCREVnTP
NSAWGGEGSLGCGIGYGYLHRIPTRPFEEGKKISLPGQMAGTPITPLKDGFTEV
QLSSVNPPSLSPPGTTGIEQSLTGLSISS (SEQ ID NO:514); RIPTRPFEEGKKI
SLPGQMAGTPITPLKDGFTEVQLSSVNPPSLSPPGTTGIEQSLTGLSISSTPPAVS
SVLSTGVPTVPLLPPQVNQSLTSVPPMNPATTLPGLMPLPAGLPNLPNLNLNLP
APHIMPGVGLPELVNPGLPPLPSMPPRN (SEQ ID NO:516); PGLPPLPSMPPRN
LPGIAPLPLPSEFEPSFPLVPESSSAASSGELLSSLPPTSNAPSDPATTTAKADAA
SSLTVDVTPPTAKAPTTVEDRVGDSTPVSEKPVSAAVDAN (SEQ ID NO:517).

This gene is expressed primarily in prostate cancer, and to a lesser extent in the
pineal glands and in fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological conditions and pulmonary disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the nervous,
pulmonary, and endocrine systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 291 as residues: Asn-9 to Leu-14.
The abundance of this gene in the pineal gland and its homology to a gene
disrupted in the neurodegenerative disease state Dentatorubral-pallidoluysian atrophy
indicate that this gene may be useful in the treatment and/or detection of other
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorder. The abundance of this gene in fetal
lung would suggest that misregulation of the expression of this protein product in the
adult could lead to lymphoma or sarcoma formation, particularly in the lung; that it may
also be involved in predisposition to certain pulmonary defects such as pulmonary
edema and embolism, bronchitis and cystic fibrosis; and thus that the gene or the
protein encoded by the gene could be used in the detection and/or treatment of these
pulmonary disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 59 
This gene is expressed primarily in the developing embryo. 
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the developmental system, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder.
The expression of this gene primarily in the embryo indicates that the gene plays a
key role in embryo development and that the gene or the protein encoded by the gene
could be used in the treatment and/or detection of developmental defects in the embryo
or in infants.

FEATURES OF PROTEIN ENCODED BY GENE NO: 60 

This gene displays homology to nestin, an intermediate filament protein, the
expression of which correlates with the proliferation of Central Nervous System
progenitor cells and that is useful in the identification of brain tumors. This gene maps
to chromosome 1, and therefore, may be used as a marker in linkage analysis for
chromosome 1 (See Accession No. AA527348).

This gene is expressed primarily in kidney and to a lesser extent in brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, renal disorders and neurodegenerative conditions. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the excretory and
nervous systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 293 as residues: Thr-128 to Asn-135.
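
Residue ranges of the form "Thr-128 to Asn-135" use 1-based positions within the referenced sequence. As an illustration only, the following Python sketch shows how such a range could be read out of a one-letter sequence string and checked against the named end residues. The function name `extract_epitope` and the short example sequence are hypothetical and are not taken from the sequence listing.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges written as, e.g., "Thr-128 to Asn-135".
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def extract_epitope(sequence: str, residue_range: str) -> str:
    """Return the subsequence named by a 1-based, inclusive residue range.

    Raises ValueError if the range is malformed, falls outside the sequence,
    or if the residues at its ends do not match the names in the range.
    """
    match = RANGE_RE.fullmatch(residue_range.strip())
    if match is None:
        raise ValueError(f"unrecognized residue range: {residue_range!r}")
    start_name, start_pos = match.group(1), int(match.group(2))
    end_name, end_pos = match.group(3), int(match.group(4))
    if not 1 <= start_pos <= end_pos <= len(sequence):
        raise ValueError("residue range lies outside the sequence")
    epitope = sequence[start_pos - 1:end_pos]  # convert to 0-based slicing
    if (THREE_TO_ONE.get(start_name) != epitope[0]
            or THREE_TO_ONE.get(end_name) != epitope[-1]):
        raise ValueError("end residues do not match the sequence")
    return epitope


# Hypothetical 11-residue sequence, used only to exercise the function.
example = "MKTLNAGSELQ"
epitope = extract_epitope(example, "Asn-5 to Leu-10")
```

The end-residue check is a simple safeguard against an off-by-one error or a mismatched sequence identifier.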

The tissue distribution and homology to nestin indicate that polynucleotides
and polypeptides corresponding to this gene are useful for detection and/or treatment of
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorder. In addition, its abundance in kidney
indicates that it is useful in the treatment and detection of acute renal failure and other
disease states associated with the kidney.



FEATURES OF PROTEIN ENCODED BY GENE NO: 61 

Gene shares homology with the latrophilin-related protein 1 precursor as well as
the calcium-independent alpha-latrotoxin receptor. Preferred polypeptide fragments
comprise the following amino acid sequence:

IYKVFRHTAGLKyEVSCFENIRSCARXXXXXXXXXXXXWIFGVLHVVHASVV
TA^XF^VSNAFQG^IFIFLFLCVLSRKIQEEYYRLFKNVPCC (SEQ ID NO:518);
WIFGVLHWHASVVTAYL^
C (SEQ ID NO:519). Also preferred are polynucleotide fragments encoding these
polypeptide fragments (See Accession No. 2213659). The translation product of this
gene shares sequence homology with CD97, a seven transmembrane bound receptor.
This gene is expressed primarily in infant brain and in endothelial cells. 
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disorders and hematopoietic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the neurological and
hematopoietic systems, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 294 as residues: Lys-13 to Leu-21.

The tissue distribution of this gene suggests that it may be useful in the detection
and/or treatment of neurodegenerative disease states and behavioral disorders such as
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, schizophrenia, mania,
dementia, paranoia, obsessive compulsive disorder and panic disorder, while its
expression in hematopoietic cell types indicates that the gene could be important for the
treatment or detection of immune or hematopoietic disorders including arthritis, asthma
and immunodeficiency diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 62 

This gene is expressed primarily in fetal liver and fetal spleen.
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematological and immunological disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune and hematopoietic systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 295 as residues: Ser-91 to Lys-98.
The tissue distribution of this gene in fetal liver and spleen indicates that the gene
could be important for the treatment or detection of immune or hematopoietic disorders
including arthritis, leukemia, asthma and immunodeficiency diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 63 
Gene shares homology with human serum amyloid protein. Preferred polypeptide
fragments comprise the following amino acid sequence:

ALTRIPPGDWVINVTAVSFAGKTTARFFHSSPPSLGDQARTDPGHQRRD (SEQ
ID NO:520) (See Accession No. W13671). Also preferred are polynucleotide
fragments encoding these polypeptide fragments. This gene maps to chromosome 9, and
therefore, may be used as a marker in linkage analysis for chromosome 9 (See
Accession No. AA004342).

This gene is expressed primarily in fetal liver and spleen. 
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematopoietic and immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the hematopoietic and immune systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.
The tissue distribution of this gene in fetal liver and spleen indicates that the gene
could be important for the treatment or detection of immune or hematopoietic disorders
including arthritis, leukemia, asthma, and immunodeficiency diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 64

This gene maps to chromosome 3, and therefore, may be used as a marker in
linkage analysis for chromosome 3 (See Accession No. AA219669).
This gene is expressed specifically in the brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegenerative disease states. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the neurological systems, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder.
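
The comparison described above, between the expression level in a test sample and the standard level in healthy tissue or bodily fluid, can be sketched in Python as follows. The specification does not fix a numerical cutoff for "significantly higher or lower"; the two-fold threshold used here is an assumption chosen purely for illustration, as are the function name `classify_expression` and the sample values.

```python
from statistics import mean


def classify_expression(sample_level, healthy_levels, fold_threshold=2.0):
    """Compare a sample's expression level with the standard level.

    The standard level is taken as the mean of measurements from healthy
    individuals. The fold_threshold of 2.0 is an illustrative assumption,
    not a value given in the specification.
    """
    if not healthy_levels or any(level <= 0 for level in healthy_levels):
        raise ValueError("healthy_levels must be non-empty and positive")
    if sample_level < 0 or fold_threshold <= 1:
        raise ValueError("sample_level must be >= 0 and fold_threshold > 1")
    standard_level = mean(healthy_levels)
    fold_change = sample_level / standard_level
    if fold_change >= fold_threshold:
        return "higher"
    if fold_change <= 1 / fold_threshold:
        return "lower"
    return "comparable"


# Hypothetical measurements in arbitrary units.
healthy = [4.0, 5.0, 6.0]
result_high = classify_expression(10.0, healthy)
result_low = classify_expression(2.0, healthy)
result_same = classify_expression(5.5, healthy)
```

In practice a statistical test over replicate measurements would normally replace the fixed fold-change cutoff.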

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder.

FEATURES OF PROTEIN ENCODED BY GENE NO: 65 

Gene shares homology with a yeast protein. Preferred polypeptide fragments
comprise the following amino acid sequence: LQEVNTTLPENSVWYERYKFDIPVFHL
(SEQ ID NO:521). Also preferred are polynucleotide fragments encoding these
polypeptide fragments (See Accession No. 1332638).

This gene is expressed primarily in fetal tissue (fetus and fetal liver). 
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver disorders and cancers (e.g. hepatoblastoma). Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the hepatic system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 298 as residues: Asn-59 to Glu-64.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection and treatment of liver disorders
and cancers (e.g. hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and
conditions that are attributable to the differentiation of hepatocyte progenitor cells). In
addition, the expression in fetus would suggest a useful role for the protein product in
developmental abnormalities, fetal deficiencies, pre-natal disorders and various
wound-healing models and/or tissue trauma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 66 

Gene has homology with a B-cell surface antigen, which may indicate that the gene plays
a role in the immune response, including, but not limited to, disorders and infections of
the immune system. Preferred polynucleotide fragments comprise the following
sequence: TAGCATGTAGCCAGTCGAATAACNTATAAGGACAAAGTGGAGTC
CACGCGTGCGGCCGTCTAGACTAGTGGATCCCCCGGCTGCAGGATTCGGC
ACGAG (SEQ ID NO:523). Also preferred are polypeptide fragments encoded by
these polynucleotide fragments (See Accession No. T94535). Additionally, this gene
shares homology with an interferon-gamma receptor. Preferred polypeptide fragments
also comprise the following amino acid sequence: MQGSGSQFRACLLCLCFSCPC
SPGGPRWNSRQGGRRFPKTCRAISQNLVFKYKTFCPVRYMQPHRSSLCLHFTS
YVFILSTWGSLRTYSTDLKKKKKNSRGGPVPIRPKS (SEQ ID NO:522);
MQGSGSQFRACLLCLCFSCPCSPGGPRWNSRQGGRRFPKTCRAISQNLVFK
(SEQ ID NO:524); PVRYMQPHRSSLCLHFTSYVFILSTWGSLRTYSTDLKKKKK
NSRGGPVPIRPKS (SEQ ID NO:525); and GEEQRDCSLGWRGVGMRATHCQAA
RMFVLFSLPKYAGL (SEQ ID NO:526). Also preferred are polynucleotide fragments
encoding these polypeptide fragments.
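
As a reading aid, the relationship between the polypeptide fragments listed above can be checked mechanically. The Python sketch below, offered only as an illustration, locates a fragment within a longer sequence and reports its 1-based residue positions. The sequences are copied from the text above as transcribed here; the function name `locate_fragment` and the variable names are hypothetical.

```python
def locate_fragment(full_sequence, fragment):
    """Return the 1-based (start, end) positions of fragment in full_sequence.

    Returns None when the fragment does not occur in the sequence.
    """
    index = full_sequence.find(fragment)
    if index == -1:
        return None
    return index + 1, index + len(fragment)


# Sequences as transcribed in the paragraph above.
seq_522 = (
    "MQGSGSQFRACLLCLCFSCPC"
    "SPGGPRWNSRQGGRRFPKTCRAISQNLVFKYKTFCPVRYMQPHRSSLCLHFTS"
    "YVFILSTWGSLRTYSTDLKKKKKNSRGGPVPIRPKS"
)
seq_524 = "MQGSGSQFRACLLCLCFSCPCSPGGPRWNSRQGGRRFPKTCRAISQNLVFK"
seq_525 = "PVRYMQPHRSSLCLHFTSYVFILSTWGSLRTYSTDLKKKKKNSRGGPVPIRPKS"
seq_526 = "GEEQRDCSLGWRGVGMRATHCQAARMFVLFSLPKYAGL"

# SEQ ID NO:524 and SEQ ID NO:525 are expected to fall at the two ends
# of SEQ ID NO:522; SEQ ID NO:526 is a separate sequence.
positions = {
    "524": locate_fragment(seq_522, seq_524),
    "525": locate_fragment(seq_522, seq_525),
    "526": locate_fragment(seq_522, seq_526),
}
```

A `None` result, as for SEQ ID NO:526, shows only that the fragment is not an exact substring of the longer sequence as transcribed.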

This gene is expressed primarily in T-cells and gall bladder. 
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological disorders and conditions (immunodeficiencies, cancer,
leukemia, hematopoiesis). Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune and digestive systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 299 as residues:
Thr-41 to Gly-52.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of immune
disorders including: leukemias, lymphomas, auto-immune disorders,
immunosuppressive conditions (transplantation), immunodeficiencies (e.g. AIDS),
inflammation and hematopoietic disorders. The expression of this gene in gall bladder
would suggest a possible role for this gene product in digestive disorders, particularly
of the pancreas.

FEATURES OF PROTEIN ENCODED BY GENE NO: 67 

This gene maps to chromosome 11, and therefore, may be used as a marker in
linkage analysis for chromosome 11 (See Accession No. AA011622).

This gene is expressed primarily in a variety of fetal and developmental tissues 
(e.g. fetal spleen, infant brain). 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental, immune or neurological abnormalities. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the developing
immune and central nervous systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 300 as residues: SeroS to Ser-43.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for developmental abnormalities or fetal
deficiencies. The detection in infant brain would suggest a role in neurological disorders
(both developmental and neurodegenerative conditions of the brain and nervous system,
behavioral disorders, depression, schizophrenia, Alzheimer's disease, Parkinson's
disease, Huntington's disease, mania, dementia). In addition, the detection in spleen
would similarly suggest a role in detection and treatment of immunologically mediated
disorders (e.g. immunodeficiency, inflammation, cancer, wound healing, tissue repair,
hematopoiesis).

FEATURES OF PROTEIN ENCODED BY GENE NO: 68

This gene is expressed primarily in spleen, T-cells, and fetal heart.
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological deficiencies, including AIDS, and cardiovascular
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune and cardiovascular systems, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of immune
disorders including: leukemias, lymphomas, autoimmune disorders,
immunodeficiencies (e.g. AIDS), immunosuppressive conditions (transplantation) and
hematopoietic disorders. The expression in fetal heart indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the treatment and diagnosis of
cardiovascular disorders (e.g. heart disease, restenosis, atherosclerosis, stroke, angina,
thrombosis).

FEATURES OF PROTEIN ENCODED BY GENE NO: 69

Gene shares homology with a human collagen protein. Preferred polypeptide
fragments comprise the following amino acid sequence:
MPRKTSKCRQLLCSGASRNADTAARQSTCSSHRPPGKIPSLGPRRXPGCXSVP
SSRGEQSTGSPAAPRCGRRDAHRGLPGGAAMTPGDTWASFNPRAGHSKSQGE
GQESSGASRQDRHPVSHWVERQREAWGAPRSSSAGGVKVAATTEREPEFKIK
TGKA (SEQ ID NO:527); CSGASRNADTAARQSTCSSHRPPGKIPSLGPRRXPG
CXSVPSSRGEQSTGSPAAPRCGRRDAHRGLPGGAAMTPGDTWASFNPRAGHS
(SEQ ID NO:528); QGEGQESSGASRQDRHPVSHWVERQREAWGAPRSSSAGG
VKVAATTEREPEFKIKTGKA (SEQ ID NO:529) (See Accession No. 124886). Also
preferred are polynucleotide fragments encoding these polypeptide fragments.
This gene is expressed primarily in fetal heart.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cardiovascular disorders. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the cardiovascular system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 302 as residues:
Pro-32 to Ser-39.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of cardiovascular
disorders (e.g. heart disease, restenosis, atherosclerosis, stroke, angina, thrombosis).

FEATURES OF PROTEIN ENCODED BY GENE NO: 70 

The translation product of this gene shares sequence homology with a chicken
single-strand DNA-binding protein. Preferred polypeptide fragments comprise the
following amino acid sequence:

MSPRYPGGPRPPLRIPNQALGGVPGSQPLLPSGMDPTRQQGHPNMGGPMQRM
TPPRGMVPLGPQNYGGAMRPPLNALGGPGMPGMNMGPGGGRPWPNPTNAN
SIPYSSASPGNYVGPPGGGGPPGTPIMPSPADSTNSGDNMYTLMNAVPPGPNR
PNFPMGPGSDGPMGGLGGMESHHMNGSLGSGDMDSISKNSPNNMSLSNQP
GTPRDDGEMGGNFLNPFQSESYSPSMTMSV (SEQ ID NO:530); MSPRYPGG
PRPPLRIPNQALGGVPGSQPLLPSGMDPTRQQGHPNMGGPMQRMTPPRGMVP
LGPQNYGGAMRPPLNALGGPGMPGMNMGPGGGRPWPNPTNANSIPYSSASP
GNY (SEQ ID NO:531); LNALGGPGMPGMNMGPGGGRPWPNPTNANSIPYSS
ASPGNYVGPPGGGGPPGTPIMPSPADSTNSGDNMYTLMNAVPPGPN (SEQ ID
NO:532); GPMGGLGGMESHHMNGSLGSGDMDSISKNSPNNMSLSNQPGTPR
DDGEMGGNTLNPFQSESYSPSMTMSV (SEQ ID NO:533); TCEHSSEAKAFHDY
(SEQ ID NO:534). Also preferred are polynucleotide fragments encoding these
polypeptide fragments (See Accession No. 1562534).

This gene is expressed primarily in placenta and to a lesser extent in the fetal
heart and a variety of other tissues and cell types.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities and fetal deficiencies, particularly of the
cardiovascular system. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the reproductive system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection and treatment of developmental
abnormalities or fetal deficiencies, ovarian and other endometrial cancers, reproductive
dysfunction, cardiovascular disorders, and pre-natal disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 71 

This gene is expressed primarily in fetal liver and to a lesser extent in the breast 
and testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver disorders (including hepatoblastomas) and reproductive disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
hepatic and reproductive systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for detection and treatment of liver disorders and
cancers (e.g. hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and
conditions that are attributable to the differentiation of hepatocyte progenitor cells). The
expression in testes and breast indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection and treatment of endocrine and
reproductive disorders (e.g. sperm maturation, milk production, testicular and breast
cancers).

FEATURES OF PROTEIN ENCODED BY GENE NO: 72

This gene maps to chromosome 1, and therefore, may be used as a marker in
linkage analysis for chromosome 1 (See Accession No. W93595).

This gene is expressed primarily in smooth muscle and to a lesser extent in
brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cardiovascular and neurological disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the cardiovascular and central nervous
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.
The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection and treatment of restenosis,
atherosclerosis, stroke, angina, thrombosis, wound healing and other conditions of
heart disease. In addition, the expression in brain would suggest that polynucleotides
and polypeptides corresponding to this gene are useful for the detection and treatment of
developmental, degenerative and behavioral conditions of the brain and nervous system
(e.g. schizophrenia, depression, Alzheimer's disease, Parkinson's disease,
Huntington's disease, mania, dementia, paranoia, addictive behavior and sleep
disorders).

FEATURES OF PROTEIN ENCODED BY GENE NO: 73 

Gene shares homology with human stromalin-2. Preferred polypeptide
fragments comprise the following amino acid sequence:

QAFVLLSDLLL1PSPQMIVGGRDFLRPLVFFPEATLQSELASFLMDHWIQPGDL
GSGA (SEQ ID NO:535); ACSYLLCNPEFTFFSRADFARSQLVDLLTDRFQQE
LEELLQVG (SEQ ID NO:536); QKQLSSLRDRMVAFCELCQSCLSDVDTEIQEQV
ST (SEQ ID NO:537); QVILPALTLVYFSELWTLTHISKSDAS (SEQ ID NO:538);
STHDLTR^LYEPCCQLLQIC^VDTGX (SEQ ID NO:539). Also preferred
are polynucleotide fragments encoding these polypeptide fragments (See Accession
No. R65208). This gene maps to chromosome 7, and therefore, may be used as a
marker in linkage analysis for chromosome 7 (See Accession No. D52585).

This gene is expressed primarily in the brain (infant brain, adult brain, pituitary,
cerebellum, hippocampus, schizophrenic hypothalamus, amygdala).

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, developmental and neurodegenerative diseases of the brain and nervous 
system. Similarly, polypeptides and antibodies directed to these polypeptides are useful 
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the 
central nervous system, expression of this gene at significantly higher or lower levels 
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 306 as residues: Thr-25 to Lys-36,
Lys-55 to Ser-63.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for detection and treatment of developmental, 
degenerative and behavioral conditions of the brain and nervous system (e.g.
schizophrenia, depression, Alzheimer's disease, Parkinson's disease, Huntington's
disease, mania, dementia, paranoia, addictive behavior and sleep disorders).

FEATURES OF PROTEIN ENCODED BY GENE NO: 74

This gene is expressed primarily in the hypothalamus of a human suffering from
schizophrenia.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the CNS, particularly schizophrenia. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the CNS, such as schizophrenia,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 307 as residues: Gly-33 to Ala-44.

The tissue distribution indicates that the protein products of this gene are useful 
for the study, diagnosis and treatment of schizophrenia and other disorders involving 
the CNS. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 75

Preferred polypeptides of the invention comprise the following amino acid
sequence encoded by this gene: 

LAVSTSFICCADISTALPLGSSRPAPAPRHREHEHGHQARPPRLLXTSLMPLSTP 
AAAQLLWTQLTPMGGRPGGRHSPPTLHTGPRALPPGPPHPSLHVAALSLLR 
(SEQ ID NO:540). Polynucleotides encoding such polypeptides are also provided.

This gene is expressed primarily in endometrial tumor and to a lesser extent in 
amniotic cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, reproductive and immune disorders, particularly cancers of those
systems. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the reproductive and immune systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 308 as residues: Ser-3 to Arg-9.

The tissue distribution indicates that the protein products of this gene are useful
for study and treatment of immune and reproductive disorders, particularly cancers of
those systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 76 

This gene is expressed primarily in kidney cortex and to a lesser extent in early
stage human brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, renal disorders such as renal cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the kidney, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 309 as residues:
Gly-38 to Gly-45, Gly-47 to Gly-52, Pro-92 to Lys-110.
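
The preferred epitopes above are given as residue ranges in three-letter amino acid notation. As an illustration only, and not as part of the original disclosure, the following Python sketch parses such a range list and extracts the corresponding fragment from a polypeptide sequence. The names `parse_epitope_ranges` and `extract_epitope` are hypothetical, and the short sequence in the usage note is a made-up toy example; the sequence of SEQ ID NO: 309 is not reproduced in this section.

```python
import re
from typing import List, NamedTuple

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


class EpitopeRange(NamedTuple):
    start_residue: str  # three-letter code of the first residue
    start: int          # 1-based position of the first residue
    end_residue: str    # three-letter code of the last residue
    end: int            # 1-based position of the last residue


# Matches one range written as, e.g., "Gly-38 to Gly-45".
_RANGE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text: str) -> List[EpitopeRange]:
    """Parse every 'Xaa-N to Yaa-M' range found in the text."""
    ranges = []
    for first, start, last, end in _RANGE.findall(text):
        if first not in THREE_TO_ONE or last not in THREE_TO_ONE:
            raise ValueError(f"unknown residue code in {first}-{start} to {last}-{end}")
        if int(start) > int(end):
            raise ValueError(f"range {start}-{end} runs backwards")
        ranges.append(EpitopeRange(first, int(start), last, int(end)))
    return ranges


def extract_epitope(sequence: str, epitope: EpitopeRange) -> str:
    """Return the 1-based inclusive fragment, checking both terminal residues."""
    if epitope.end > len(sequence):
        raise ValueError("range extends past the end of the sequence")
    fragment = sequence[epitope.start - 1:epitope.end]
    if (fragment[0] != THREE_TO_ONE[epitope.start_residue]
            or fragment[-1] != THREE_TO_ONE[epitope.end_residue]):
        raise ValueError("terminal residues do not match the sequence")
    return fragment
```

For example, `parse_epitope_ranges("Gly-38 to Gly-45, Gly-47 to Gly-52")` yields two ranges, and `extract_epitope("MKSGGGR", parse_epitope_ranges("Ser-3 to Arg-7")[0])` returns the toy fragment `SGGGR`. Checking the terminal residues against the sequence is a design choice that catches off-by-one numbering before a fragment is used.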

The tissue distribution indicates that the protein products of this gene are useful 
for study, treatment and diagnosis of renal diseases such as cancer of the kidney.

FEATURES OF PROTEIN ENCODED BY GENE NO: 77

This gene is expressed primarily in kidney medulla.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, metabolic and renal disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the metabolic and renal systems, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for study, treatment and diagnosis of metabolic and renal diseases and disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 78

This gene is expressed in chronic synovitis and microvascular endothelium.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, arthritis and atherosclerosis. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the vascular and skeletal systems, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 
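
The comparison described above, expression in a test sample relative to the standard expression level in healthy tissue or bodily fluid, can be sketched in code. This is a simplified illustration under stated assumptions, not a method given in the text: the function name `classify_expression`, the use of mean levels, and the default two-fold cutoff are all hypothetical choices. The text does not define what counts as "significantly" higher or lower, and a real assay would likely apply a statistical test to replicate measurements.

```python
import math
from statistics import mean
from typing import Sequence


def classify_expression(
    sample_levels: Sequence[float],
    standard_levels: Sequence[float],
    min_fold_change: float = 2.0,  # hypothetical cutoff, not from the text
) -> str:
    """Compare mean expression in a test sample with the standard level.

    Returns "higher", "lower", or "comparable" depending on whether the
    sample mean differs from the standard mean by at least the given
    fold change in either direction.
    """
    if not sample_levels or not standard_levels:
        raise ValueError("both sets of measurements must be non-empty")
    if min_fold_change <= 1.0:
        raise ValueError("min_fold_change must be greater than 1")

    sample_mean = mean(sample_levels)
    standard_mean = mean(standard_levels)
    if sample_mean <= 0 or standard_mean <= 0:
        raise ValueError("mean expression levels must be positive")

    # Work on a log2 scale so up- and down-regulation are symmetric.
    log2_ratio = math.log2(sample_mean / standard_mean)
    cutoff = math.log2(min_fold_change)

    if log2_ratio >= cutoff:
        return "higher"
    if log2_ratio <= -cutoff:
        return "lower"
    return "comparable"
```

With these assumptions, a sample averaging four times the standard level is reported as "higher", one averaging a quarter of it as "lower", and a 1.5-fold difference as "comparable".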

The tissue distribution indicates that the protein products of this gene are useful
for study, diagnosis and treatment of arthritic and other inflammatory diseases as well
as cardiovascular diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 79

This gene is expressed in resting T-cells and activated monocytes. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for the study and treatment of immune diseases such as inflammatory conditions. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 80 

This gene is expressed in a variety of immune system tissues, e.g., neutrophils,
T-cells, and TNF induced epithelial and endothelial cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, infectious and immune disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune and vascular systems, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 313 as residues: Met-1 to Trp-6.

The tissue distribution indicates that the protein products of this gene are useful
for study and treatment of infectious diseases and immune and vascular disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 81

This gene is expressed in activated neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and other immune conditions. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful
for study and treatment of immune disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 82 

This gene is expressed in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory and other immune conditions. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 315 as residues:
Ala-83 to Thr-91.

The tissue distribution indicates that the protein products of this gene are useful
for study and treatment of immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 83

This gene is expressed in human neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune and inflammatory system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein products of this gene are useful
for diagnosis and treatment of disorders of the inflammatory and immune systems. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 84

This gene is expressed in human neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the inflammatory and immune systems. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the inflammatory and
immune systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that the protein products of this gene are useful
for diagnosis and treatment of disorders of the immune and inflammatory systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 85

This gene is expressed in activated neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and immune system diseases. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system and inflammatory
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution indicates that the protein products of this gene are useful
for diagnosis and treatment of diseases of the inflammatory and immune systems. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 86

This gene is expressed in activated neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and immune system disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the inflammatory and immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 319 as residues: Met-1 to Gly-6, Gly-32 to Pro-43, Leu-55 to Gln-60.

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis and treatment of disorders of the immune and inflammatory system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 87

In specific embodiments, polypeptides of the invention comprise the sequence:

EQVLALLWPRPH-ILEMNVQSVRSTDPQRLGGLDTRPHYn'RRYAEFSSALVSIN
QTTP^RTMQLIGQLQVEVENFVUW
R^DDSKEVESFQQLLNARTQEFTEELLSPPFGGLVAFVKE^ALIERGQAERLR
GEEARVTQLIRGFGSSWKSSVESLSQDVMRSFTNFRNGTSIIQG (SEQ ID
NO:541); AlXKYRFFYQFlXGNFJLATAKEIRDEY\ r ETLSKIYTSYYRSYLGRLMK
VQ^T£EV^KDDLMGVEDTAKKGFXSKPSRSRNTIFTLGTRGSVISPTELEAPILV
PHTAQR (SEQ ID NO:542); EQRYPFEALFRSQHYXI .1 .DNSCREYLFICEFFVVS
GPX,\HDLFHAV^lGRTLSMTLKHLDSYL.\DCYDAlAVFLCIHrvXRFRNlAAKRD
VPALDRYW (SEQ ID NO:543); GGLDTRPHYITRRYAEFSSALVSINQ (SEQ ID
NO:544); SRKEQLVFLINNYDMMLGVL (SEQ ID NO:545); and/or ALLKYRFFY
QFLLGNTR^T.AKEIRDEYVETLSKIYXSYYRSYLGRLMKVQYEEV.AEKDDLMG
VEDTAKKGFXSKPSLRSRNTIFTLGTRGSVISPTELEAPILVPHTAQRXEQRYPF
EALFRSQHYXLLDNSCREYLFICEFFVVSGPXAHDLFHAVMGRTLSMTLKHLD
SYTADCYDAIAVFLCMIVLRFRiNL^AKFUDVP.ALDR^VEQVLALLWPRFE
NVQSVRSTDPQRLGGLDTRPHYITRRYAEFSSALVSINQTIPNERTMQLLGQLQV
EVENT^RVAAEFSSRKEQLVFLINT^YDNL\ILGVLiVIER\ADDSKEVESFQQLLN
ARTQEFffiELLSPPFGGLVAFVKEAEALIERGQAERLRGEEARVTQLIRGFGSSW
KSSVESLSQDVMRSFTNFRNGTS (SEQ ID NO:546). Polynucleotides encoding
these polypeptides are also encompassed by the invention. The translation product of 
this gene shares sequence homology with suppressor of actin mutation which is thought 
to be important in mutation suppression. 

This gene is expressed primarily in fetal liver and to a lesser extent in a variety
of other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver disorders and mutations. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the liver or cancer, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 320 as residues:
Val-53 to Arg-60, Thr-88 to Thr-94, Ala-142 to Ser-150, Gly-188 to Glu-196,
Gly-208 to Ser-214, Thr-227 to Gly-232, Lys-279 to Phe-285.

The tissue distribution and homology to suppressor of actin mutation suggest
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and treatment of liver disorders or cancer.



FEATURES OF PROTEIN ENCODED BY GENE NO: 88

This gene maps to chromosome 9, and therefore can be used in linkage analysis
as a marker for chromosome 9. In specific embodiments, polypeptides of the invention
comprise the sequence: 

YEGKEFDYWSIDVNEGGPSYKLPYOTSDDPWLTAYNFLQKNDLNPNIFLDQVA
KFIIDNTKGQMLGLGNPSFSDPFTGGGRYVPGSSGSSNTLPTADPFTGAGRYV
PGSASMGTTMAGVDPFTGNSAYRSAASKTMNIYFPKKEAVTFDQANFTQELGK
LKELNGTAPEEKKLTEDDLILLEKILSLICNSSSEKPTVQQLQrLWKAINCPEDIV
FPALDILRJLSIKHPSVNENFCNEKEGAQFSSHLINLLNPKGKPANQLLALRTFC
NCFVGQAGQKLMMSQRESLMSHAIELKSGSNKNI (SEQ ID NO:547);
HIALATLALNYSVCFHKD (SEQ ID NO:548); HNTEGKAQCLSLISTELEWQ
DLEATFRLLVALGTLISDDSNAVQLAKS (SEQ ID NO:549); LGVDSQIKKYSS
VSEPAKVSECCRFELNLL (SEQ ID NO:550); and/or YEGKEFDYVFSIDVNEGGPS
YKLPYNTSDDPWLTAYNTT^QKNDLNPNIFLDQVAKFnDNTKGQMLGLGNPSFS
DPFTGGGRYVPGSSGSSNTLPTADPFTGAGRYVPGSASMGTTMAGVDPFTGN
SAYRSAASKT^lNIYT^KXEAVTFDQANPTQrLGKLKELNGTAPEEKKLTEDDLI
LLEKILSLICNSSSEKFTVQQLQE-WK^A.INCPEDI\'TT > ALDILRLSIKHPSVNENFC
NEK£GAQFSSHLINLLNPKGKPANQLLALRTFCNCFVGQAGQKLMMSQRESL
MSHAlELKSGSNKNIHIALATL.ALNYSVCFTiKDHNIEGK.AQCLSLISTILEVVQD
I^ATr^RLLVALGTLISDDSNAVQLAKSLGVDSQIKKYSSVSEPAKVSECCRFILN
LL (SEQ ID NO:551). Polynucleotides encoding these polypeptides are also
encompassed by the invention. These polypeptides share significant homology with
phospholipase A2 activating protein which is thought to be important in signal
transduction (see, e.g., Wang et al., Gene 161(2):237-241 (1995)).

This gene is expressed primarily in endothelial cells and to a lesser extent in placenta,
endometrial stromal cells, osteosarcoma, testis tumor, muscle, and infant brain, which are
likely to be rich in blood vessels.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the vascular system, aberrant angiogenesis, and tumor angiogenesis.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
vascular system or tumors, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. 

The tissue distribution of this gene in endothelial cells and several potentially
highly vascularized tissues and its homology to phospholipase A2 activating protein
suggest that this gene may be involved in transducing signals for endothelial cells in
angiogenesis or vasculogenesis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 89 

In specific embodiments, polypeptides of the invention comprise the sequence: 
YPNQDGDEJIDQVLHEHIQRLSKVVTANHRALQ
AYKTPRDKVQCILRMCSTIMI^
STVQYISSFYASCLSGEESYWWMQFTAAVE (SEQ ID NO:552); YPNQDGDILR
DQVLHEHIQRLSKVVTANHRA^
CILRMCSTLMNTLSLANEDSVPGA^
SCLSGEESYWWMQFTAAVEFIKTI (SEQ ID NO:553); YPNQDGDILRDQVL (SEQ
ID NO:554); EAPWPSAQSEI (SEQ ID NO:555); PVLVFVLIKANP (SEQ ID
NO:560); SGEESYWWMQFTAAVEFIKTI (SEQ ID NO:556); ADDFVPVLVF
VLIKANPP (SEQ ID NO:557); YKTPRDKVQOL (SEQ ID NO:558); and/or
GADDFVPVLVFVLIK (SEQ ID NO:559). The translation product of this gene shares
sequence homology with human ras inhibitor and yeast VPS9p which is thought to be
important in Golgi vacuole transport.

This gene is expressed primarily in T cells and melanocytes and to a lesser 
extent in a variety of other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, dysfunction and disorders involving T cells and melanocytes. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the immune system, 
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. 

The tissue distribution and homology to ras inhibitor indicate that
polynucleotides and polypeptides corresponding to this gene are useful for regulating
signal transduction and for diagnosis and treatment of disorders involving T cells and
melanocytes.

FEATURES OF PROTEIN ENCODED BY GENE NO: 90

This gene maps to chromosome 9 and therefore polypeptides of the invention
can be used in linkage analysis as a marker for chromosome 9. The translation product
of this gene shares sequence homology with neuronal olfactomedin-related ER localized
protein which is thought to be important in influencing the maintenance, growth, or
differentiation of chemosensory cilia on the apical dendrites of olfactory neurons. In
specific embodiments, polypeptides of the invention comprise the sequence:

SARASTQPPAGQHPGPC (SEQ ID NO:561); MPGRWRWQRDMHPARKLLSLL
FULMGTELTQD (SEQ ID NO:562); SAAPDSLLRSSKGSTRGSL (SEQ ID
NO:563); AAIVIWRGKSESRIAKTPGI (SEQ ID NO:564); FRGGGTLVLPPTHT
PEWUL (SEQ ID NO:567); PLGITLPLGAPETGGGD (SEQ ID NO:565); and/or
CAAETWKGSQRAGQLCALLA (SEQ ID NO:566).

This gene is expressed in pineal gland.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological and endocrinological disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the neurological or endocrine systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 323 as residues: Leu-20 to Ala-26, Arg-32 to Arg-39, Thr-104 to Gly-112.

The tissue distribution and homology to olfactomedin-related protein indicate
that polynucleotides and polypeptides corresponding to this gene are useful for
maintenance, growth, or differentiation of neuron cells in the pineal gland and, therefore,
may be useful for diagnosis and treatment of neurological disorders of the pineal gland.

FEATURES OF PROTEIN ENCODED BY GENE NO: 91

This gene is expressed primarily in prostate and apoptotic T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, prostate disease and T cell dysfunction. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly prostate cancer, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for detecting abnormal activity in prostate and
T cells, and possibly for treating such abnormality.

FEATURES OF PROTEIN ENCODED BY GENE NO: 92 

This gene is expressed primarily in prostate and to a lesser extent in smooth 
muscle cells, fibroblasts, and placenta. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders in prostate or vascular system. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the prostate or vascular system, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for regulating function of prostate or highly 
vascularized tissues, e.g. placenta. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 93

This gene is expressed primarily in human embryonic and fetal-stage tissues and
to a lesser extent in a wide variety of other proliferative tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders in embryonic development and cell proliferation. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the embryonic tissues
and proliferative cells, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis or treatment of abnormalities in
developing and proliferative cells and organs. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 94

The translation product of this gene shares sequence homology with
transformation-related protein, which is thought to be important in transformation.

This gene is expressed primarily in female reproductive tissues, i.e., breast 
cancer cells, placenta, and ovary, and to a lesser extent in fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancer or dysfunction of reproductive tissues. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the reproductive system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 327 as residues: Ser-50 to Pro-61. 

The tissue distribution and homology to transformation-related protein indicate
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and treatment of conditions caused by transformation, i.e., tumorigenesis in
reproductive organs, e.g., breast, placenta, and ovary.

FEATURES OF PROTEIN ENCODED BY GENE NO: 95 

This gene is expressed primarily in testes, rhabdomyosarcoma, infant brain and 
to a lesser extent in some tumors and highly vascularized tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, tumorigenesis, abnormal angiogenesis, and/or neurological disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the 
tumor tissues or vascular tissues, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 328 as residues: Arg-46 to Trp-54,
Pro-60 to Ile-69, Asn-116 to Ala-122, Arg-147 to Lys-153, Ser-158 to Glu-170,
Ile-399 to Ser-405, Pro-486 to Met-499, and Pro-502 to Asp-508.
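
Preferred epitopes are given throughout as residue ranges of the form "Arg-46 to Trp-54" within a listed sequence. As an illustration only, the following minimal Python sketch parses such ranges and slices the corresponding peptide from a one-letter amino acid string. It assumes that residue positions are 1-based and counted from the first residue of the listed sequence; the function names and the short test sequence are hypothetical and are not part of the specification.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Arg-46 to Trp-54".
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return (start_res, start_pos, end_res, end_pos) tuples found in text."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def extract_epitope(sequence, start_res, start_pos, end_res, end_pos):
    """Slice a 1-based inclusive range and check the boundary residues."""
    if not (1 <= start_pos <= end_pos <= len(sequence)):
        raise ValueError("range lies outside the sequence")
    peptide = sequence[start_pos - 1:end_pos]
    if peptide[0] != THREE_TO_ONE[start_res] or peptide[-1] != THREE_TO_ONE[end_res]:
        raise ValueError("boundary residues do not match the sequence")
    return peptide


# Hypothetical six-residue sequence used only to exercise the functions.
ranges = parse_epitope_ranges("Arg-3 to Asp-5")
peptide = extract_epitope("MARNDW", *ranges[0])
```

The boundary check is a design choice: it catches numbering that is offset from the sequence actually supplied, for example when a signal peptide has been removed.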

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for a range of disease states including treatment of 
tumor or vascular disorders and the treatment of neurological disorders such as
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, schizophrenia, mania,
dementia, paranoia, obsessive compulsive disorder and panic disorder. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 96 

This gene maps to chromosome 7 and therefore polynucleotides of the present 
invention can be used in linkage analysis as a marker for chromosome 7. The 
translation product of this gene is homologous to the Clostridium perfringens 
enterotoxin (CPE) receptor gene product and shares sequence homology with a human 
ORF specific to prostate and a glycoprotein specific to oligodendrocytes, both of which
are tissue-specific proteins. (See, e.g., Katahira et al., J Cell Biol. 136(6):1239-1247
(1997); PMID: 9087440; UI: 97242441.)

This gene is expressed primarily in pancreas tumor and ulcerative colitis and to a
lesser extent in several tumors and normal tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, pancreatic disorder, ulcerative colitis, tumors and food poisoning. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
digestive system or tumorigenic system, expression of this gene at significantly higher 
or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. Preferred epitopes include 
those comprising a sequence shown in SEQ ID NO: 329 as residues: Gly-147 to Met-152
and Cys-177 to Lys-188.

The tissue distribution and homology to prostate-specific and oligodendrocyte-specific
proteins indicate that polynucleotides and polypeptides corresponding to this gene are
useful as markers for diagnosis or treatment of disorders of the pancreas, ulcerative
colitis, and tumors. Furthermore, identity to the human receptor for Clostridium
perfringens enterotoxin indicates that the soluble portion of this receptor could be used
in the treatment of food poisoning associated with Clostridium perfringens by blocking
the activity of perfringens enterotoxin.

FEATURES OF PROTEIN ENCODED BY GENE NO: 97

The translation product of this gene shares sequence homology with ATPase,
which is thought to be important in metabolism.

This gene is expressed primarily in testes and several hematopoietic cells and to
a lesser extent in other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, leukemia and hematopoietic disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the hematopoietic system, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 330 as residues: Leu-37 to Ala-42.

The tissue distribution and homology to ATPase indicate that polynucleotides
and polypeptides corresponding to this gene are useful as markers for the diagnosis and
treatment of leukemia and other hematopoietic disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 98

In specific embodiments, polypeptides of the invention comprise the sequence:
MRSARPSLGCLPSWAFSQALNI (SEQ ID NO:568); LLGLKGLAPAEISAVCEKGNFN
(SEQ ID NO:569); VAHGLAWSYYIGYLRLILPELQARIR (SEQ ID NO:570);
TYNQHYNNLLRGAVSQRC (SEQ ID NO:571);
ILLPLDCGVPDNLSMADPNERFLDKLPQQTGDRAGIKDRVYSN (SEQ ID NO:572);
SIYELLENGQRAGTCVLEYATPLQTLFAMSQYSQAGFSGEDRLEQ (SEQ ID NO:573);
AKLFCRTLEDILADAPESQNNCRLLAYQEPADDSSFSLSQEVLPSTSTMSQEPELUSGMEKPLPLRTDFS
(SEQ ID NO:574); and/or LLGLKGLAPAEISAVCEKGNFNVAHGLAWSYYIGYLRLILPEL
(SEQ ID NO:575).
Polynucleotides encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in prostate BPH and to a lesser extent in bone 
marrow.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, benign prostatic hypertrophy or prostate cancer. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the male urinary system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 331 as residues: Ile-60 to Asn-69, Leu-106 to Asp-112, Glu-130 to Gly-136,
Phe-160 to Glu-167, Pro-134 to Cys-190, Glu-197 to Ser-202, Arg-215 to Glu-221, and
Thr-237 to Pro-242.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis or treatment of benign prostatic 
hypertrophy or prostate cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 99 
This gene is expressed primarily in salivary gland. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders or injuries of the salivary gland. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of glandular tissues, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 
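
The comparison described above, in which a sample's expression level is judged higher or lower relative to a standard level from healthy tissue or bodily fluid, can be sketched as a simple fold-change calculation. The Python example below is illustrative only: the two-fold cutoff (a threshold of 1.0 in log2 units), the function names, and the numeric inputs are assumptions, and the specification does not define what counts as "significantly" higher or lower.

```python
import math


def log2_fold_change(sample_level, standard_level):
    """Return the log2 ratio of a sample's expression level to the standard."""
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    return math.log2(sample_level / standard_level)


def classify_expression(sample_level, standard_level, threshold=1.0):
    """Label a sample 'higher', 'lower', or 'unchanged' versus the standard.

    threshold is a hypothetical cutoff in log2 units (1.0 means two-fold).
    """
    lfc = log2_fold_change(sample_level, standard_level)
    if lfc >= threshold:
        return "higher"
    if lfc <= -threshold:
        return "lower"
    return "unchanged"


# Hypothetical measurements in arbitrary units.
label = classify_expression(8.0, 2.0)
```

In practice the standard level would be estimated from several healthy samples, and a statistical test would replace the fixed cutoff used here.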

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment of disorders of, or injuries to, the
salivary gland or other glandular tissue.

FEATURES OF PROTEIN ENCODED BY GENE NO: 100

This gene maps to chromosome 15; accordingly, polynucleotides of the
invention can be used in linkage analysis as a marker for chromosome 15. The
translation product of this gene shares sequence homology with a C. elegans gene of
unknown function. In specific embodiments, polypeptides of the invention comprise
the sequence: DPRVRLNSLTCKHIFISLTQ (SEQ ID NO:583);
TMKLLKLRRNIVKLSLYRHFTN (SEQ ID NO:576); TLDLAVAASIvniWTTMKFRI (SEQ ID
NO:577); VTCQSD^^^LWVDD.AIWRLLFS^IILFV'I (SEQ ID NO:578);
MVLWRPSANNQRFAFSPLSEEEEEDEQ (SEQ ID NO:580);
fCEPMLKESFEGMKMRSTKQEPNGNSKVNKAQEDDL (SEQ ID NO:584); and/or
KWVEENVPSSVTDVALPALLDSDEERiVIITHFERSKME (SEQ ID NO:582). Polynucleotides
encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in thyroid and to a lesser extent in
osteoclastoma, kidney medulla, and lung.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, thyroid dysfunction or cancer. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the endocrine system, expression of this gene 
at significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 333 as residues: 
Lys-107 to Leu-124, Glu-150 to Thr-159, Pro-173 to Asp-179, and Ser-192 to Ser-201.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of thyroid dysfunction
or cancer. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 101 
This gene maps to chromosome 16; therefore, polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 16. In specific
embodiments, polypeptides of the invention comprise the sequence:
IRHELTVLRDTRPACA (SEQ ID NO:585); and/or MDFXMALIYD (SEQ ID
NO:586). Polynucleotides encoding these polypeptides are also encompassed by the 
invention. 

This gene is expressed primarily in kidney cortex and to a lesser extent in adult 
brain, corpus callosum, hippocampus, and frontal cortex.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the central nervous system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment or diagnosis of neurological 
disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 102 

In specific embodiments, polypeptides of the invention comprise the sequence:
MQEMMRNQDRALSNLESTPGGYNA (SEQ ID NO:587); LRRMYTDIQEPMLSAAQEQFGGNPF
(SEQ ID NO:583); ASLVSNTSSGEGSQPSRTENRDPLPNPWAPQT (SEQ ID NO:589);
SQSSSASSGTASTVGGTTGSTASGTSGQSTTAPNLVPGVGASMFNTPGMQSLLQQITENPQLMQNMLSAPY
(SEQ ID NO:590);
MRS^LVIQSLSQNPDLA.AQ^1^ILN^NTLFAGNPQLQEQ^1RQQLPTFLQQ (SEQ ID NO:591);
MQNPDTLS AMSNPFLAiMQALLQIQQGLQTLATEAPGLIPGFTPGLG
ALGSTGGSSGTNGSNATPSENTSPTAGT (SEQ ID NO:592);
TEPGHQQFI
QQMLQALAGVM^QLQNPEVRFQQQ
IERLLGSQPS (SEQ ID NO:593);
RNPAMMQEMMRNQDRALSNLESIPGGYNALRRMYTDIQEPMLSAA (SEQ ID NO:594);
GNPFASLVSNTSS (SEQ ID NO:595); ENRDPLPNPWA (SEQ ID NO:595);
GKILKDQDTLSQHGIHD (SEQ ID NO:597); GLTVHLVIKTQNRP (SEQ ID NO:598);
SELQSQMQRQLLSNPEMM (SEQ ID NO:599); PEISffivO-NNPDl>/tR (SEQ ID NO:600);
and/or RQLIMANPQMQQLIQRNP (SEQ ID NO:601). Polynucleotides encoding these
polypeptides are also encompassed by the invention.

This gene is expressed primarily in breast.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, breast cancer. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of tumor systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of some types of
breast cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 103 
The translation product of this gene shares sequence homology with secreted
serine proteases and lysozyme C precursor, which is thought to be important in
bacteriolytic function. In specific embodiments, polypeptides of the invention comprise
the sequence: NLCHVDCQDLLNPNLLAGIHCAKRIVS (SEQ ID NO:602);
LDGFEGYSLSDWLCLAFVESKFN (SEQ ID NO:603);
NENADGSFDYGLFQENSHYWCN (SEQ ID NO:604); and/or
NLCHVDCQDLLNPNLLAGIHCAKRIVS (SEQ ID NO:605). Polynucleotides
encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in testes.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, infection. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 336 as residues: Ile-62 to Phe-70 and
Asn-78 to Asn-84.

The tissue distribution and homology to lysozyme C precursor indicate that
polynucleotides and polypeptides corresponding to this gene are useful for boosting the
monocyte-macrophage system and enhancing the activity of immunoagents.

FEATURES OF PROTEIN ENCODED BY GENE NO: 104
This gene is expressed primarily in apoptotic T-cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of some immune
disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 105 

The translation product of this gene shares sequence homology with ARI
protein of Drosophila (accession 2058299; EMBL: locus DMARIADNE, accession
X98309), which is thought to be important in axonal path-finding in the central nervous
system. In specific embodiments, polypeptides of the invention comprise the sequence:
IREVNEV1QNPAT (SEQ ID NO:606); ITRILLSHFNWDKEKLMERYFDGNLEKLFA (SEQ ID
NO:607); NTRSSAQDMPCQICYLNYPNSYF (SEQ ID NO:608);
TGLECGHKFCMQCWSEYLTTKIMEEGMGQTISCPAHG (SEQ ID NO:614);
CDnjVTDDNTVMRLXTDSKV^
CHHWKVQYPDAKPV (SEQ ID NO:609);
CDrL\T>DNT\"MRLITDSK
VKLKYQHL1TNSFVECNRLLKV/CPAPDCHHVVKV (SEQ ID NO:610);
GCNHMVCRNQNCKAEFC^
RSRAALQRYL (SEQ ID NO:611);
FYCNRYMNHMQSLRFEHKLVAQVKQ
^EEMQQH^lSWEVQFLKK^VDVLCQCRATUVl^ (SEQ ID NO:612);
YVFAFYLKKNNQSH
RYCESR (SEQ ID NO:613). Polynucleotides encoding these polypeptides are also
encompassed by the invention.

This gene is expressed primarily in adult brain, and to a lesser extent in 
endometrial tumor, melanocytes, and infant brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases or injuries involving axonal path development. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the central nervous
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder.

The tissue distribution and homology to ARI protein indicate that
polynucleotides and polypeptides corresponding to this gene are useful for treatment of
disease states or injuries involving axonal path development, including
neurodegenerative diseases and nerve injury.

FEATURES OF PROTEIN ENCODED BY GENE NO: 106 

The translation product of this gene shares sequence homology with cytochrome
b561 [Sus scrofa], which is thought to be an integral membrane protein of
neuroendocrine storage vesicles of neurotransmitters and peptide hormones.

This gene is expressed primarily in frontal cortex and to a lesser extent in 
rhabdomyosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the central nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 339 as residues:
Ser-18 to Pro-24.

The tissue distribution and homology to cytochrome b561 [Sus scrofa] indicate
that polynucleotides and polypeptides corresponding to this gene are useful for
treatment and diagnosis of neurological disorders. This gene may also be important in
regulation of some types of cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 107

In specific embodiments, polypeptides of the invention comprise the sequence:
MWGYLFVDAAWNFLGCLICGW (SEQ ID NO:615);
MHFISSGNVSAIRSSILLLRXSLSYLGNCLRVSAIFVYFLLFLLLS (SEQ ID NO:616); and/or
MDQALRGSPSEGFSTDPSPPQVGRQIPSFPPWRRLVXPKASGCFLEREWHAYNSSILGGRGKGIT
(SEQ ID NO:617). Polynucleotides encoding these
polypeptides are also encompassed by the invention.
This gene is expressed primarily in pancreas tumor and to a lesser extent in 
cerebellum. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, pancreatic tumors. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the endocrine system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 340 as residues:
Pro-22 to Phe-33. 
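
By way of illustration only, the comparison of a sample's expression level with the standard gene expression level may be sketched as follows. The function name `classify_expression` and the two-fold threshold are assumptions made for this sketch; the specification does not state what magnitude of difference counts as "significantly higher or lower", and any real assay would set that cutoff statistically.

```python
import math


def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Classify a sample's expression level relative to a standard level.

    sample_level   -- measured expression in the test tissue or fluid
    standard_level -- expression in healthy tissue or fluid (the standard)
    fold_threshold -- hypothetical fold-change cutoff (assumed, not from the text)
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")

    # Work on a log2 scale so that over- and under-expression are symmetric.
    log2_ratio = math.log2(sample_level / standard_level)
    cutoff = math.log2(fold_threshold)

    if log2_ratio >= cutoff:
        return "higher"
    if log2_ratio <= -cutoff:
        return "lower"
    return "comparable"
```

For example, a sample measured at four times the standard level would be classified as "higher" under the assumed two-fold cutoff, whereas one at 1.5 times the standard would be "comparable".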

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of pancreatic tumors. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 108 

This gene maps to chromosome 17, and therefore polynucleotides of the
invention can be used in linkage analysis as a marker for chromosome 17. In specific 
embodiments, polypeptides of the invention comprise the sequence: 
MLPA1j\SCCHFSPPEQAAI^

KKFQPMNKIERSILHD WE V AGLTSFS FGEDD 

RRGEEWDPQKAEEKRNXkELAQRQ (SEQ ID NO:618); EEEAAQQGPVVV
SPASDYKDKYSHLIGKGAAKDAAHMLQ 

IRAKKRLRQSGE (SEQ ID NO:619); PPRRPAQLPLTPGAGQGAGRDKAAAIRA 

HPGAPPLNHLLP (SEQ ID NO:620); AVPQAGGKQVFDLSPLELGYVRGMCVCV
(SEQ ID NO:621) and/or MLPALASCCHFSPPEQAARLKKLQEQEKQQKVEFRK 
RNIEKEVSDHQDSGQIKKKFQPMNKIERSILHDVV'EVAGLTSFSFGEDDDCRYV 
MIFKKEFAPSDEELDSYRRGEEWDPQKAEEKRNXKE^

SPASDYKDKYSHLIGKGAAKDAAHMLQANKTYGCXPVANKRDTRSIEE.A^IN^ 

IRAKKRLRQSGE (SEQ ID NO:622). Polynucleotides encoding these polypeptides
are also encompassed by the invention. The translation product of this gene shares 
sequence homology with FSA-I which may play a role as a structural protein 
component of the acrosome. 

This gene is expressed primarily in fetal kidney and sperm. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive disorders, especially involving acrosomal dysfunction.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the male
reproductive system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 341 as residues: Glu-8 to Asn-35.

The tissue distribution and homology to FSA-I indicate that polynucleotides
and polypeptides corresponding to this gene are useful for treatment of infertility due to
acrosomal dysfunction of sperm.

FEATURES OF PROTEIN ENCODED BY GENE NO: 109 

This gene is expressed primarily in pituitary and to a lesser extent in
epididymis.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the male reproductive system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 342 as residues: Met-1 to Trp-6.

The expression of this gene in both pituitary and epididymis indicates that
polynucleotides and polypeptides corresponding to this gene are useful for treatment
and diagnosis of male reproductive disorders. This may involve a secreted peptide
produced in the pituitary targeting the epididymis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 110 

In specific embodiments, polypeptides of the invention comprise the sequence:
LLCPVLNSGXSWNFPHPSQPEYSFHGFHSTRLWI (SEQ ID NO:623); and/or
PSTPWFLFLLGLTCPFSTSHPRWDSIPP (SEQ ID NO:624). Polynucleotides
encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in resting T-cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, T-cell disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of certain immune 
disorders, especially those involving T-cells. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 111 

This gene is expressed primarily in cerebellum and whole brain and to a lesser 
extent in infant brain and fetal kidney. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the central nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 344 as residues:
Asp-48 to Gly-55.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neurological 
disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 112 

The translation product of this gene shares sequence homology with a yeast
mitochondrial ribosomal protein homologous to ribosomal protein S15 of E. coli, which
is thought to be important in the early assembly of ribosomes (See Accession No.
M38016). This gene maps to chromosome 1, and therefore, may be used as a marker
in linkage analysis for chromosome 1.

This gene is expressed primarily in developmental tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, development of cancers and tumors in addition to healing wounds.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune and developmental systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution and homology to ribosomal protein S15 of E. coli
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
diseases related to the assembly of ribosomes in the mitochondria, which is important in
the translation of RNA into protein. Therefore, this indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and intervention of
multiple tumors as well as in healing wounds, which are thought to be under similar
regulation as developmental tissues. Protein, as well as antibodies directed against the
protein, have utility as tumor markers, in addition to immunotherapy targets, for the
above listed tumors and tissues.



FEATURES OF PROTEIN ENCODED BY GENE NO: 113 

The translation product of this gene shares sequence homology with human
poliovirus receptor precursors, which are thought to be important in viral binding and
uptake. Preferred polypeptide fragments comprise the following amino acid sequence:
ELSISISNVALADEGEYTCSIFTM^ 

ATLiNrQSSGSKPAARLWRKGDQELHGEPTRIQEDPNGKTFTVSSSVTFQVTR 
EDDGASIVCSVNHESLKGADRSTSQRIEVLYTPTAMIRPDPPHPREGQKLLLHC 
EGRGNPVPQQYLWEKEGSWPLKNITQESALIFPFLNKSDSGTYGCTATSNMGS
YKAYYTLNVND (SEQ ID NO:625). Also preferred are polynucleotide fragments
encoding these polypeptide fragments (See Accession No. gnl|PID|d1002627).

This gene is expressed almost exclusively in human brain tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, susceptibility to viral disease and diseases of the CNS, especially cancers
of that system. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the central nervous system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 346 as residues: Leu-26 to Asp-37,
Lys-53 to Ser-59.

The tissue distribution and homology to poliovirus receptor precursors indicate
that polynucleotides and polypeptides corresponding to this gene are useful for the
treatment and prevention of diseases that involve the binding and uptake of virus
particles for infection. It might also be helpful in genetic therapy where the goal is to
insert foreign DNA into infected cells. With the help of this protein, the binding and
uptake of this foreign DNA might be aided. In addition, it is expected that
overexpression of this gene will indicate abnormalities involving the CNS, particularly
cancers of that system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 114 

The translation product of this gene shares sequence homology with
Y087_CAEEL hypothetical 28.5 KD protein ZK1236.7 in chromosome III of
Caenorhabditis elegans in addition to alpha-1 collagen type III (See Accession No.
gi|537432). One embodiment for this gene is the polypeptide fragment(s) comprising
the following amino acid sequence: VPELPDRVHQLHQAVQGCALGRPGFPGGPTH
SGHHKSHPGPAGGDYNRCDRPGQVHLHNPRGTGRRGQLHPTAGPGVHRRA
CPSQQLPHRLGPGVPCPSPSLTPVLPSWTQSWCGLPGYTSSS (SEQ ID
NO:630). An additional embodiment is the polynucleotide fragment(s) encoding these
polypeptide fragments.

This gene is expressed primarily in brain cells and to a lesser extent in activated
B and T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegeneration and immunological disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the neural and immune systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 347 as residues: Glu-34 to Glu-39, Gly-51 to Ser-72, Ala-88 to Glu-93, Gln-100
to Val-105.
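
Purely as an illustrative sketch, residue ranges written in the form used above (for example "Glu-34 to Glu-39") can be mapped onto a one-letter amino acid sequence as follows. The function name `epitope_segments` and the toy sequence in the usage note are hypothetical and are not taken from the sequence listing; the sketch assumes 1-based, inclusive residue numbering and standard three-letter residue codes.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Glu-34 to Glu-39".
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def epitope_segments(sequence, ranges_text):
    """Return the subsequences named by residue ranges in ranges_text.

    Residue numbers are treated as 1-based and inclusive. Each range is
    checked against the sequence so that a numbering mismatch is reported
    rather than silently producing the wrong segment.
    """
    segments = []
    for start_aa, start, end_aa, end in RANGE_RE.findall(ranges_text):
        start, end = int(start), int(end)
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"range {start}-{end} is outside the sequence")
        expected = (THREE_TO_ONE.get(start_aa), THREE_TO_ONE.get(end_aa))
        actual = (sequence[start - 1], sequence[end - 1])
        if expected != actual:
            raise ValueError(
                f"{start_aa}-{start} to {end_aa}-{end} does not match the sequence"
            )
        segments.append(sequence[start - 1:end])
    return segments
```

With the made-up sequence `MKTAYIAKQR`, the text "Met-1 to Thr-3, Ala-4 to Lys-8" would select the segments `MKT` and `AYIAK`. The residue check is a design choice of the sketch: it catches off-by-one numbering errors before a segment is used.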

The tissue distribution and homology to Y087_CAEEL hypothetical 28.5 KD
protein ZK1236.7 in chromosome III of Caenorhabditis elegans as well as to a
conserved alpha-1 collagen type III protein indicate that polynucleotides and
polypeptides corresponding to this gene are useful for the detection and treatment of
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorders. Because the gene is expressed in
cells of lymphoid origin, the natural gene product may be involved in immune
functions. Therefore it may be also used as an agent for immunological disorders
including arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 115 

The translation product of this gene shares sequence homology with alpha 3
type IX collagen, which is thought to be important in hyaline cartilage formation via its
ability to uptake inorganic sulfate by cells (See Accession No. gi|975657). One
embodiment of this gene is the polypeptide fragment comprising the following amino
acid sequence: SLRRPRSAAXQTLTTFLSSVSSASSSALPGSREPCDPRAPPPPR
SGSAASCCSCCGSCPRRR\PLRSPRGSKRRIRQREVVDLYNGMCLQGPAGVPG 
RDGSPGANGIPGTPGIPGRDGFKGEKGECLRESFEESWTPNYKQCSWSSLNY 

GIDLGK1AECTFTKMRSNSALRVLFS
LPIEAHYLDQGSPEMNSTINTHRT^

DASTGWNSVSRinEELPK (SEQ ID NO:634). An additional embodiment is the
polynucleotide fragments encoding this polypeptide fragment.

This gene is expressed primarily in smooth muscle and to a lesser extent in
synovial tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, dwarfism, spinal deformation, and specific joint abnormalities as well as
chondrodysplasias, i.e., spondyloepiphyseal dysplasia congenita, familial osteoarthritis,
Atelosteogenesis type II, metaphyseal chondrodysplasia type Schmid and autoimmune
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the skeletal system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution and homology to alpha 3 type IX collagen indicate that
polynucleotides and polypeptides corresponding to this gene are useful for the treatment
and diagnosis of diseases associated with the mutation in this gene which leads to the
many different types of chondrodysplasias. By the use of this product, the abnormal
growth and development of bones of the limbs and spine could be routinely detected or
treated in utero since the protein or muteins thereof could affect epithelial cells early in
development and later the chondrocytes of the developing craniofacial structure.

FEATURES OF PROTEIN ENCODED BY GENE NO: 116 

The translation product of this gene shares sequence homology with
retrovirus-related reverse transcriptase, which is thought to be important in viral
replication. One embodiment for this gene is the polypeptide fragments comprising the
following amino acid sequence: TKKENCRPASLiVINIDTKILNKILMNQ (SEQ ID NO:640). An
additional embodiment is the polynucleotide fragments encoding these polypeptide
fragments (See Accession No. pir|A25313|GNHUL1).

This gene is expressed primarily in human meningima.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, retroviral diseases such as AIDS, and possibly certain cancers due to
transactivation of latent cell division genes. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to retrovirus-related reverse transcriptase
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
the detection and treatment of diseases and maladies associated with retroviral infection
since a functional reverse transcriptase (RT) or RT-like molecule is an integral
component of the retroviral life cycle.

FEATURES OF PROTEIN ENCODED BY GENE NO: 117 

The translation product of this gene shares sequence homology with an
unknown gene from C. elegans, as well as weak homology with mammalian metaxin, a
gene contiguous to both thrombospondin 3 and glucocerebrosidase that is known to be
required for embryonic development. Preferred polypeptide fragments comprise the
following amino acid sequence: MCNLPIKWCRANAEYMSPSGKVPXXHVGNQ
VVSELGPIVQFVKAKGHSLSDGLEEVQ 

DEATVGXITHXRYGSPYPWPLXHILAYQKQWEVKRKXICAIGWGKKTLDQVLE 
DVDQCCQALSQRLGTQPYFFNKQPTELDALVrc

YSNLLAFCRR1 EQHYFEDRGKGRLS (SEQ ID NO:641); MCNLPIKVVCRANAE
YMSPSGKVPXXHVGNQVVSELGPIVQFVK (SEQ ID NO:642). Also preferred are
polynucleotide fragments encoding these polypeptide fragments (See Accession No.
gi|1326108).

This gene is expressed primarily in fetal tissues and to a lesser extent in
hematopoietic cells and tissues, including spleen, monocytes, and T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer; lymphoproliferative disorders; inflammation; chondrosarcoma;
and Gaucher disease. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the hematopoietic and embryonic systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of cancer and other
proliferative disorders. Expression in embryonic tissue and other cellular sources
marked by proliferating cells indicates that this protein may play a role in the regulation
of cellular division. Additionally, the expression in hematopoietic cells and tissues
indicates that this protein may play a role in the proliferation, differentiation, and
survival of hematopoietic cell lineages. Thus, this gene may be useful in the treatment
of lymphoproliferative disorders, and in the maintenance and differentiation of various
hematopoietic lineages from early hematopoietic stem and committed progenitor cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 118

The translation product of this gene shares sequence homology with reverse
transcriptase, which is important in the synthesis of a cDNA chain from an RNA
molecule and is the means by which the infecting RNA chains of retroviruses are
transcribed into their DNA complements. One embodiment for this gene is the
polypeptide fragment comprising the following amino acid sequence:
iVLXXXNSHITICT
KIKGWRKIYQANGKQKK (SEQ ID NO:647). An additional embodiment is the
polynucleotide fragments comprising polynucleotides encoding these polypeptide
fragments (See Accession No. gi|2072964).

This gene is expressed primarily in skin and to a lesser extent in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer; hematopoietic disorders; inflammation; and disorders of immune
surveillance. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the epidermis and/or hematopoietic system, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to reverse transcriptase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for cancer
therapy. Expression in the skin also indicates that this gene is useful in wound healing
and fibrosis. Expression by neutrophils also indicates that this gene product plays a role
in inflammation and the control of immune surveillance (i.e., recognition of viral
pathogens). Reverse transcriptase family members are also useful in the detection and
treatment of AIDS.

FEATURES OF PROTEIN ENCODED BY GENE NO: 119 

The translation product of this gene shares sequence homology with reverse
transcriptase, which is important in the synthesis of a cDNA copy of an RNA molecule
and is the means by which a retrovirus reverse-transcribes its genome into an inheritable
DNA copy.

This gene is expressed primarily in the frontal cortex of brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer and neurodegenerative disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the CNS and peripheral nervous system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to reverse transcriptase suggest that this gene is
useful in the treatment of cancer and AIDS. The expression in brain indicates that it
plays a role in neurodegenerative disorders and in neural degeneration.

FEATURES OF PROTEIN ENCODED BY GENE NO: 120

One embodiment of this gene has homology to a hypothetical protein in
Schizosaccharomyces pombe (See Accession No. 2281980). Another embodiment for
this gene is the polypeptide fragments comprising the following amino acid sequence:
IYHLHSWIFFHFKRAFCMCnTMKVIHAHCSKLRKCX
(SEQ ID NO:651). An additional embodiment is the polynucleotide fragments
encoding these polypeptide fragments. This gene maps to chromosome 18, and
therefore, may be used as a marker in linkage analysis for chromosome 18.

This gene is expressed primarily in adult hypothalamus and to a lesser extent in
infant brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegenerative disorders; endocrine function; and vertigo. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the brain, CNS and
peripheral nervous system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of
neurodegenerative disorders; diagnosis of tumors of a brain or neuronal origin;
treatments involving hormonal control of the entire body and of homeostasis; and
behavioral disorders, such as Alzheimer's Disease, Parkinson's Disease, Huntington's
Disease, schizophrenia, mania, dementia, paranoia, obsessive compulsive disorder and
panic disorder. In addition, the gene or gene product may also play a role in the
treatment and/or detection of developmental disorders associated with the developing
embryo.

FEATURES OF PROTEIN ENCODED BY GENE NO: 121 

The translation product of this gene shares sequence homology with the human
IRLB protein, which is thought to be important in binding to a c-myc promoter element
and thus regulating its transcription (See Accession No. gi|33969). This gene maps to
chromosome 1, and therefore, may be used as a marker in linkage analysis for
chromosome 1.

This gene is expressed primarily in brain and breast and to a lesser extent in a
variety of hematopoietic tissues and cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer of the brain and breast; lymphoproliferative disorders; and
neurodegenerative diseases. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the CNS, breast, and immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of cancer of the
brain, breast, and hematopoietic system. In addition, it may be useful for the treatment
of neurodegenerative disorders, as well as disorders of the hematopoietic system,
including defects in immune competency and inflammation. Protein, as well as
antibodies directed against the protein, may show utility as tumor markers and
immunotherapy targets for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 122 

The translation product of this gene shares sequence homology with an ATP 
synthase, a key component of the proton channel that is thought to be important in the 
translocation of protons across the membrane. 

This gene is expressed primarily in T cell lymphoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, T cell lymphoma. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution and homology to ATP synthase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for the treatment
of defects in proton transport, homeostasis, and metabolism, as well as the diagnosis
and treatment of lymphoma. Because the gene is expressed in cells of lymphoid origin,
the natural gene product may be involved in immune functions. Therefore it may also be
used as an agent for immunological disorders including arthritis, asthma, immune
deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 123 

This gene maps to chromosome 15, and therefore, may be used as a marker in
linkage analysis for chromosome 15.

This gene is expressed primarily in a variety of fetal tissues, including fetal 
liver, lung, and spleen, and to a lesser extent in a variety of blood cells, including 
eosinophils and T cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer (abnormal cell proliferation); T cell lymphomas; and hematopoietic
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the fetus and immune system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of conditions
involving cell proliferation. Expression of this gene in fetal tissues, as well as in a
variety of blood cell lineages, indicates that it may play a role in cellular
proliferation, apoptosis, or cell survival. Thus it may be useful in the management and
treatment of a variety of cancers and malignancies. In addition, its expression in blood
cells suggests that it may play additional roles in hematopoietic disorders and conditions,
and could be useful in treating diseases involving autoimmunity, immune modulation,
immune surveillance, and inflammation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 124 

This gene is expressed primarily in placenta and to a lesser extent in pineal gland 
and rhabdomyosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental, endocrine, and female reproductive disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the [insert system
where a related disease state is likely, e.g., immune], expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 357 as residues:
Leu-69 to Val-76.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of disorders in
development. The protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and immunotherapy target for the above listed tumors and
tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 125

This gene is expressed primarily in benign prostatic hyperplasia.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of benign prostatic hyperplasia. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the reproductive
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of benign prostatic
hyperplasia. The protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and immunotherapy target for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 126 

This gene is expressed primarily in apoptotic T-cells and to a lesser extent in 
suppressor T cells and ulcerative colitis. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases involving premature apoptosis, and immunological and
gastrointestinal disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of disorders involving
inappropriate levels of apoptosis, especially in immune cell lineages. Because the gene
is expressed in cells of lymphoid origin, the natural gene product may be involved in
immune functions. Therefore it may also be used as an agent for immunological
disorders including arthritis, asthma, immune deficiency diseases (such as AIDS), and
leukemia.



FEATURES OF PROTEIN ENCODED BY GENE NO: 127

This gene is expressed primarily in Raji cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and T cell autoimmune disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the immune system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 360 as residues: Asp-23 to Gly-29.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of inflammation and T
cell autoimmune disorders. Because the gene is expressed in cells of lymphoid origin,
the natural gene product may be involved in immune functions. Therefore it may also be
used as an agent for immunological disorders including arthritis, asthma, immune
deficiency diseases (such as AIDS), and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 128 

The translation product of this gene shares sequence homology with a C.
elegans coding region, C47D12.2, of unknown function (See Accession No.
gnl|PID|e348986). One embodiment for this gene is the polypeptide fragments
comprising the following amino acid sequence: EDDGFNRSIHEVILKNITWY
SERVLTEISLGSLLILVVIRTI^
AAQRIISLFSLLSKKHNKVLEQATQSLR
LEIINSCLTNSLHHNPNTVALLYKRDLre^
QAGS (SEQ ID NO:657); EDDGFNRSIHEVILKNITWYSERVLTEISLGSLLILVV
(SEQ ID NO:658); RTIQYNMTRTRDKY"LHTNCLA.AL.\NMSAQFRSLHQYAAQ
RIISLFSLLSKKHN (SEQ ID NO:659); KKHNKVLEQATQSLRGSLSSNDVPLPDY*
AQD (SEQ ID NO:661); SCLTNSLHHNPNTVYALLYKRDLFEQFRTHPSFQD
EVIQNIDLVISFFSSRLLQAGS (SEQ ID NO:660). An additional embodiment is the
polynucleotide fragments encoding these polypeptide fragments. This gene maps to
chromosome 18, and therefore, may be used as a marker in linkage analysis for
chromosome 18.

This gene is expressed primarily in smooth muscle and to a lesser extent in fetal
liver.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, atherosclerosis and other cardiovascular and hepatic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the circulatory
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of circulatory system
disorders such as atherosclerosis, hypertension, and thrombosis. In addition, the tissue
distribution indicates that polynucleotides and polypeptides corresponding to this gene
are useful for the detection and treatment of liver disorders and cancers (e.g.,
hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and conditions that are
attributable to the differentiation of hepatocyte progenitor cells). In addition, the
expression in fetus would suggest a useful role for the protein product in developmental
abnormalities, fetal deficiencies, pre-natal disorders and various wound-healing models
and/or tissue trauma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 129 
The translation product of this gene shares sequence homology with a ribosomal
protein which is thought to be important in cellular metabolism, in addition to the
C. elegans protein F40F11.1 which does not have a known function at the current time
(See Accession No. gnl|PID|e244552). Preferred polypeptide fragments comprise the
following amino acid sequence:
MADIQTERAYQKQPTIFQNKKRVLLGETGK^
PRRLLRGTYIDKKCPFTGNVSIRGRILSGVVTQDEDAEDHCHPPRLSALHPQVQ
PLREAPQEHVCTPVPL LQGRPDR (SEQ ID NO:662); MKMQ RTIVIRRD YLU -
YIRKYNRFEKRHKNMSVHLSPCFRDVQ^
AAGTKKQFQKF (SEQ ID NO:663); MADIQTERAYQKQCTIFQNKKRVLLGET
GK (SEQ ID NO:664); HCHPPRLSALHPQVQPLREAPQEHVCTPVPL LQGRPDR
(SEQ ID NO:666); NIGLGFKDTPRRLLRGTYIDKKCPFTGNVSIRGRILSGVVTQ
(SEQ ID NO:669); MK^IQRTIVIRRDYLHYIRKWRFEKRHKNMSVHLSP (SEQ
ID NO:667); CFRDVQIGDI\TVGECRPLSKTVRFNVLKVTK,^AGTKKQFQKF
(SEQ ID NO:668). Also preferred are polynucleotide fragments encoding these
polypeptide fragments.

This gene is expressed primarily in Wilms' tumor and to a lesser extent in
thymus and stromal cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases affecting RNA translation. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly Wilms' tumors, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 362 as residues:
Thr-11 to Asp-20.

The tissue distribution and homology to a ribosomal protein indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diseases
affecting RNA translation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 130 

The translation product of this gene shares sequence homology with a yeast
DNA helicase which is thought to be important in global transcriptional regulation (See
Accession No. gnl|PID|e243594). One embodiment for this gene is the polypeptide
fragments comprising the following amino acid sequence: IFYDSDWNPTVDQQA
MDRAHRLGQTKQVTVYRLICKGT^ (SEQ ID
NO:670); TRMIDLLEEYMVYRKOT
FVFLLSTRAGGLGINLTAXDTVHF (SEQ ID NO:671); TRMIDLLEEYMVYRK
HTYXRLDGSSKISERRDM (SEQ ID NO:674); RRDMVADFQNRNDIFVFLL
STRAGGLGINLTAXDTVHF (SEQ ID NO:675); IFYDSDWNPTVDQQAMD
RAHRLGQTKQVTVYRLICKG (SEQ ID NO:676); RLICKGTEEERILQRAK
EKSEIQRMVISG (SEQ ID NO:678). An additional embodiment is the polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed primarily in amygdala.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases and disorders of the brain. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the central nervous system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to a DNA helicase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diseases
affecting RNA transcription, particularly developmental disorders and healing wounds,
since the latter are thought to approximate developmental transcriptional regulation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 131 
This gene is expressed primarily in prostate and to a lesser extent in amygdala
and pancreatic tumors.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, prostate enlargement and gastrointestinal disorders, particularly of the
pancreas and gall bladder. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the reproductive system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of prostate diseases, 
including benign prostatic hyperplasia and prostate cancer. In addition, the tissue 
distribution in tumors of the pancreas indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and intervention of these tumors, in
addition to other tissues where expression has been indicated. The protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy target for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 132 

This gene is expressed primarily in adult lung and to a lesser extent in 
hypothalamus. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, pulmonary diseases and neurological disorders. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the pulmonary and respiratory 
systems, expression of this gene at significantly higher or lower levels may be routinely 
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of pulmonary and 
respiratory disorders such as emphysema, pneumonia, and pulmonary edema and
emboli. In addition, the tissue distribution indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for the detection/treatment of 
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease, 
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, 
obsessive compulsive disorder and panic disorder. In addition, the gene or gene 
product may also play a role in the treatment and/or detection of developmental
disorders associated with the developing embryo, sexually-linked disorders, or
disorders of the cardiovascular system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 133 
This gene is expressed primarily in human liver.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cirrhosis of the liver and other hepatic disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the digestive system, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of liver disorders such
as cirrhosis, jaundice, and hepatitis. The protein, as well as antibodies directed against
the protein, may show utility as a tumor marker and/or immunotherapy target for the above
listed tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 134

This gene is expressed primarily in fetal kidney and to a lesser extent in fetal 
liver and spleen. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, development and regeneration of liver and kidney and immunological
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the digestive and excretory systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 367 as residues: Pro-70 to Arg-77,
Tyr-102 to Thr-107.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of diseases of the
kidney and liver, such as cirrhosis, kidney failure, kidney stones, liver failure,
hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and conditions that are
attributable to the differentiation of hepatocyte progenitor cells. In addition, the
expression in fetus would suggest a useful role for the protein product in developmental
abnormalities, fetal deficiencies, pre-natal disorders and various wound-healing models
and/or tissue trauma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 135

This gene is expressed primarily in brain, bone marrow, and to a lesser extent in 
placenta, T cell, testis and neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegenerative and immunological diseases and cancer. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the nervous and
immune systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 368 as residues: Met-1 to His-6.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder. In addition, the gene or gene product may also
play a role in the treatment and/or detection of developmental disorders associated with
the developing embryo, or sexually-linked disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 136 
The translation product of this gene is homologous to the human WD repeat
protein HAN11. Preferred polypeptide fragments comprise the following amino acid
sequence:
MSLHGKRKEIYKYEAPWWYAiV^
LDEESSE^CRNTFDHPYPTTKL^1WIPDTKGVYPDLLATSGD\T.RVWRVGETET
RiECLLNNNKNSDFCAPLTSFDWNEVDPYLLGTSSIDTTCTTW
NLVSGHVKTQLIAHDKEVYD^
STIIYEDPQHHPLLRLCWN
HVSiVL\LLGPHIHPATS,\LQRiVlTTRLSSGTSSKCPEPLRTLSWTTQLXGEINNVQ
WASTQPELSPSATTTAWRYSECSVGGAVPTRQGLLYFLPLPHPQS (SEQ ID
NO:679); MSLHGKRKEIYKYnEAP\VTVY.\iVIN^3VRPDKRFRL.ALGSFV
EEWNKVQLVGLDEESSEFICRNTFDHPYPTTKL^VIPDTKGVYPDLLATSGDY
LRVWRVGETETRLECLLNNNKNSDFCAPLTSFDWNEVDPYLL (SEQ ID
NO:680); FDV^-NEVDP^LGTSSIDTTCTIWGLETGQVLGRVNLVSGHVK
TQLLAHDKEVYDLAFSRAGGGRJDi^^
HPLLRLCWKQDPNYLATMAMDGN^^ (SEQ ID
NO:681); VGADGSVmFDLRHLEHSTIIY^DPQHHPLLRLCWNKQDPNYLA
TMAMDGMEVVILDVRVPAHLXPGTTE
SGTSSKCPEPLRTLSWPTQLXGEINNVQWASTQPELSPSATTTAWRYSECSVG
GAVPTRQGLLYFLPLPHPQS (SEQ ID NO:682). Also preferred are polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed primarily in placenta, embryo, T cell and fetal lung and 
to a lesser extent in endothelial, tonsil and bone marrow. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological and developmental diseases in addition to cancers.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 369 as residues: Gly-19 to Gln-28, Pro-36 to Phe-42.

The tissue distribution in tumors of colon, ovary, and breast origins indicates
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. The protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy target for the above
listed tumors and tissues. Because the gene is expressed in cells of lymphoid origin, the
natural gene product may be involved in immune functions. Therefore it may also be
used as an agent for immunological disorders including arthritis, asthma, immune
deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 137

This gene is expressed primarily in TNF and INF induced epithelial cells, T 
cells and kidney. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory conditions, particularly inflammatory reactions in the
kidney. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the renal
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 370 as residues: Thr-67 to Gly-72, Gln-132 to
Ala-145, Arg-150 to Pro-157.
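As an illustrative sketch only, the residue ranges listed above can be read as 1-based, inclusive positions in the referenced sequence. The Python below parses ranges written in the form "Thr-67 to Gly-72" and slices a sequence accordingly. The function names and the short placeholder sequence are hypothetical; the SEQ ID NO: 370 sequence itself is not reproduced in this section, so no real epitope is extracted here.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Thr-67 to Gly-72".
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return a list of (start_res, start_pos, end_res, end_pos) tuples."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def extract_epitope(sequence, start_res, start_pos, end_res, end_pos):
    """Slice a 1-based inclusive range and check both boundary residues."""
    if not (1 <= start_pos <= end_pos <= len(sequence)):
        raise ValueError("range lies outside the sequence")
    peptide = sequence[start_pos - 1:end_pos]
    if peptide[0] != THREE_TO_ONE[start_res] or peptide[-1] != THREE_TO_ONE[end_res]:
        raise ValueError("boundary residues do not match the sequence")
    return peptide


# Ranges as given in the text for SEQ ID NO: 370.
ranges = parse_epitope_ranges(
    "Thr-67 to Gly-72, Gln-132 to Ala-145, Arg-150 to Pro-157"
)

# Hypothetical 10-residue placeholder, NOT a sequence from this document.
toy_sequence = "MKTAYIAGQR"
toy_peptide = extract_epitope(toy_sequence, "Thr", 3, "Gly", 8)
```

Checking the boundary residues against the named amino acids is a cheap guard against off-by-one numbering when a range is applied to the wrong sequence.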

The tissue distribution indicates that the protein products of this gene are useful 
for treating the damage caused by inflammation of the kidney. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 138

This gene maps to chromosome 1, and therefore, may be used as a marker in
linkage analysis for chromosome 1 (See Accession No. D63485).

This gene is expressed primarily in breast cancer and colon cancer and to a
lesser extent in thymus and fetal spleen.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers, especially of the breast and colon tissues. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.
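The comparison described above, in which a sample's expression level is measured against the standard level from healthy tissue or fluid, can be sketched as a simple fold-change test. This is a minimal illustration, assuming normalized expression values and an arbitrary two-fold cutoff. The function name, the threshold, and the example numbers are hypothetical and are not taken from the specification; a real assay would rely on replicates and a statistical test rather than a fixed cutoff.

```python
import math


def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Compare a sample's expression level with the standard (healthy) level.

    Returns "higher", "lower", or "unchanged" depending on whether the
    sample differs from the standard by at least fold_threshold.
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    log2_fold = math.log2(sample_level / standard_level)
    cutoff = math.log2(fold_threshold)
    if log2_fold >= cutoff:
        return "higher"
    if log2_fold <= -cutoff:
        return "lower"
    return "unchanged"


# Hypothetical normalized values, for illustration only.
print(classify_expression(8.0, 2.0))  # sample is 4-fold above the standard
print(classify_expression(1.0, 4.0))  # sample is 4-fold below the standard
print(classify_expression(3.0, 2.5))  # within the two-fold band
```

Working on a log2 scale treats a two-fold increase and a two-fold decrease symmetrically, which matches the "higher or lower" wording of the text.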

The tissue distribution in tumors of colon and breast origins indicates that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and intervention of these tumors, in addition to other tumors where expression has been
indicated. The protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy target for the above listed tumors and
tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 139

This gene maps to chromosome 17, and therefore, can be used as a marker in
linkage analysis for chromosome 17.

This gene is expressed primarily in CD34 positive cells, and to a lesser extent in
activated T-cells and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunologically related diseases and hematopoietic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system
and hematopoietic system, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution in CD34 positive cells, T-cells and neutrophils indicates that
polynucleotides and polypeptides corresponding to this gene are useful for treatment
and diagnosis of hematopoietic disorders and immunologically related diseases, such as
anemia, leukemia, inflammation, infection, allergy, immunodeficiency disorders,
arthritis, asthma, and immune deficiency diseases such as AIDS.

FEATURES OF PROTEIN ENCODED BY GENE NO: 140 

This gene was recently cloned by another group, who called it the KIAA0313
gene (See Accession No. d1021609). Preferred polypeptide fragments comprise the
amino acid sequence:

LYATATV1SSPSTEXLSQDQGDRASLDAADSGRGSWTSCSSGSHDN1QTIQ 
HQRSWETLPFGHTHFDYSGDPAGLWASSSHMDQIMFSDHSTKYNRQNQSRES 
LEQAQSRASWASSTGYWGEDSEGDTGTIKRRGGKDVSCE.\ESSSLTSVTTEETK 
PVPMPAHIAVASSTTKGLIARKEGRYREPPPTPPGYIGIPITDFPEGHSHPARKP 

PDYNVALQRSRMVARSSDTAGPSSVQQPHGHPTSSRPVNKPQWHKXNESDPR
LAPYQSQGFSTEEDEDEQVSAV (SEQ ID NO:683); HMDQIMFSDHSTKYNRQ
NQSRESLEQAQSRASWASSTGYWGE (SEQ ID NO:684); SVTTEETKPVPMP
AHIAVASSTTKGLIARKEGRYREPPPTPPGYIGIPITD (SEQ ID NO:685); and
VALQRSRMVARSSDTAGPSSVQQPHGHPTSSRPVNKPQW
HKXNESDPRLAPYQSQGF (SEQ ID NO:686). Also preferred are polynucleotide
fragments encoding these polypeptide fragments. This gene maps to chromosome 4,
and therefore, may be used as a marker in linkage analysis for chromosome 4 (See
Accession No. AB002311).
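The shorter preferred fragments above are stretches of the longer one, and that relationship can be checked mechanically. The following is a minimal sketch, assuming the sequences are plain one-letter strings; the function name is hypothetical. The two strings are copied from the text as printed (an excerpt of SEQ ID NO:683 and the whole of SEQ ID NO:684), so they may carry character-recognition errors and should not be treated as authoritative sequence data.

```python
def locate_fragment(parent, fragment):
    """Return the 1-based start position of fragment in parent, or None."""
    # Drop any whitespace introduced by line wrapping or scanning.
    parent = "".join(parent.split())
    fragment = "".join(fragment.split())
    index = parent.find(fragment)
    return None if index < 0 else index + 1


# Excerpt of SEQ ID NO:683 as printed above (not the complete sequence).
parent_excerpt = (
    "HQRSWETLPFGHTHFDYSGDPAGLWASSSHMDQIMFSDHSTKYNRQNQSRES"
    "LEQAQSRASWASSTGYWGEDSEGDTGTIKRRGGKDVSC"
)

# SEQ ID NO:684 as printed above.
fragment_684 = "HMDQIMFSDHSTKYNRQNQSRESLEQAQSRASWASSTGYWGE"

start = locate_fragment(parent_excerpt, fragment_684)
if start is None:
    print("fragment not found in excerpt")
else:
    end = start + len(fragment_684) - 1
    print(f"fragment spans excerpt positions {start} to {end}")
```

The reported positions are relative to the excerpt only, not to the numbering of the full-length polypeptide.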

This gene is expressed primarily in ovarian cancer and in tumors of the testis, brain,
and colon.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, ovarian, testicular, brain and colon cancers. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the male and female reproductive systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution in tumors of colon, ovary, testis, and brain origins
indicates that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. The protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy target for the
above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 141 

This gene is expressed primarily in spleen and colon cancer.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, colon cancer and immunological disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the gastrointestinal tract and immune
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution in tumors of colon, ovary, and breast origins indicates
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. The protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy target for the
above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 142

Translation product is homologous to T cell translocation protein, a putative zinc
finger factor (See Accession No. 340454), as well as to the G-protein coupled receptor
TM5 consensus polypeptide (See Accession No. R50734). Preferred polypeptide
fragments comprise the following amino acid sequence:

CLLFVFV*SLGMRCLFV/TIVWVL\TKHKCNTVLLC\T-ILCSI (SEQ ID NO:687);
ACSKLIPAFEMVMRAKDNVYHLDCFACQ

DYEEGLMKEGYAPXVR (SEQ ID NO:688). Also preferred are polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed primarily in fetal brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disorders including brain cancer. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the Central Nervous System,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder. In addition, the gene or gene product may also
play a role in the treatment and/or detection of developmental disorders associated with
the developing embryo.

FEATURES OF PROTEIN ENCODED BY GENE NO: 143 

Translation product for this gene has significant homology to the Fas ligand,
which is a cysteine-rich type II transmembrane protein/tumor necrosis factor receptor
homolog. Mutations within this protein have been shown to result in generalized
lymphoproliferative disease leading to the development of lymphadenopathy and
autoimmune disease (See Medline Article No. 94185175). Preferred polypeptide
fragments comprise the following amino acid sequence:

SALSEPGAPDRRRPCPESVPRRPDDEQWPPPTALCLDVAPLPPSS (SEQ ID 
NO:689). Also preferred are polynucleotide fragments encoding these polypeptide 
fragments (See Accession No. 473565). 

This gene is expressed primarily in osteoblasts, lung, and brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, osteoblast-related, pulmonary, neurological, and immunological
diseases. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the skeletal and nervous systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 376 as residues: Trp-33 to Thr-40,
Lys-45 to Ile-63.

The tissue distribution in osteoblasts, lung, and brain combined with its
homology to the Fas ligand indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and intervention of these tumors, in
addition to other tumors where expression has been indicated. The protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy target for the above listed tumors and tissues. Because the Fas ligand
gene is known to be expressed in cells of lymphoid origin, the natural gene product
may be involved in immune functions. Therefore, it may also be used as an agent for
immunological disorders including asthma, immune deficiency diseases such as AIDS
and leukemia, and various autoimmune disorders including lupus and arthritis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 144 

This gene shares sequence homology with a 21.5 kD transmembrane protein in
the SEC15-SAP4 intergenic region of yeast (See Accession No. 1723971). Preferred
polypeptide fragments comprise the amino acid sequence:

AHASESGERWWACCGVRFGLRSIEAIGRSCCHDGPGGLVANRGRRFKWAIEL 
SGPGGGSRGRSDRGSGQGDSL\TVGYLDKQVPDTSVQETDRILVEKRCWDLAL
GPLKQIPMNLFIMYN'L^^
KFLQGLVTLIGNIJVrc

GLLL (SEQ ID NO:691); PVGYLDKQVPDTSVQETDRILVEKRC\VT)L\LGPLKQ
IPM^LFI (SEQ ID NO:693); and ATFIQvILESSSQKFLQGLV^XIGNLiMGL.^LAV 
YKCQSMGLLPTHASD (SEQ ID NO:692). Also preferred are polynucleotide 
fragments encoding these polypeptide fragments. 

This gene is expressed primarily in osteoclastoma, hemangiopericytoma, liver,
and lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, osteoclastoma, hemangiopericytoma, liver and lung tumors. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the above tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the lung
and liver systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosing osteoclastoma, 
hemangiopericytoma, liver and lung tumors. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 145 

Translation product of this gene shares homology with the glucagon-69 gene,
which may indicate this gene plays a role in regulating metabolism (See Accession No.
A60313). One embodiment for this gene is the polypeptide fragments comprising the
following amino acid sequence:




NO:694). An additional embodiment is the polynucleotide fragments encoding these 
polypeptide fragments. 



This gene is expressed primarily in brain, kidney, colon, and testis. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, brain, kidney, colon, and testicular cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the male reproductive, neurological,
circulatory, and gastrointestinal systems, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution in tumors of brain, kidney, colon, and testis origins
indicates that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. The protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy target for the
above listed tumors and tissues. The tissue distribution indicates that polynucleotides
and polypeptides corresponding to this gene are useful for the detection/treatment of
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorder. In addition, the gene or gene
product may also play a role in the treatment and/or detection of developmental
disorders associated with the developing embryo, sexually-linked disorders, or
disorders of the cardiovascular system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 146

The translation product of this gene shares sequence homology with goliath
protein, which is thought to be important in the regulation of gene expression during
development. The protein may serve as a transcription factor. One embodiment for this
gene is the polypeptide fragments comprising the following amino acid sequence:

TEHILWMITELRGKDILSYL^KMSVQM
LMnSSAWLIFYFIQKIRYTNARp 
PDroHCAVCIESYKQNT)VVR 

LGIV (SEQ ID NO:695); TEHIL^VMITELRGKDILS YLEKNIS VQMTIAVGTRMP 
PKNFSRGSLVFVSISFIVLM HSSAWLIFYF (SEQ ID NO: 697); .SISFIVLMIISSA 
WLffYTTQ KIR YTN ARID ^ (SEQ ID

NO:698); VKKGDKETDPDFDHCAVCIESYKQNDVVRILPCKHVFHKSCVDP
WLSEHCTCPMCKLNELKALGIV (SEQ ID NO:699). An additional embodiment is
the polynucleotide fragments encoding these polypeptide fragments (See Accession No. 
157535). Moreover, another embodiment is the polynucleotide fragments encoding 
these polypeptide fragments: 

MTHPGTEHHA VMTTELRGKDILS YLEKNIS VQMTIAVGTRMPPKNFSRGS

L VFV SISFIVLMIISS A WXIFYHQKIR^TN ARI)RNQRRLGD A AJGCMSKLTTRTV 
KKGDKETDPDFDHCAVCIESYKQNDVVRILPCKHVFHKSCVDPWLSEHCTCP 
MCKLN^LKALGIVP^^LPCTDN^AFD^1ERLTRTQAV^^^SALGDLAGDNSLGLE 
PLRTSGISPLPQDGELTPRTGEINlAVTKEWniASFGLLSALTLCYiVIIIRATASLN 

ANEVEWF (SEQ ID NO:696); MTHPGTEHIlAVMITELRGKDILSriJ£KNISVQM
TL\VGTPJvlPPKiNTSRGSLVF\"SISFIVLMIISSAWLIFYFIQKlRYTNARDRNQRR 
LGDAAKKAISKLTTRT (SEQ ID NO:700); AAKKAIS KLTTRTVKKGDKE 
TDPDFDHCAVCIESYKQiN'DVVRILPCKHVFHKSCVDPWLSEHCTCPMCKLNIL 
KALGIVPNLPC (SEQ ID NO:701); TQAVNRRSALGDLAGDNSLGLEPLRTSGI 

SPLPQDGELTPRTGEINLA.VTKEWFIlASFGLLSALTLCVTvinRATASLNANEVEW
F (SEQ ID NO:702); PLHGVADHLGCDPQTRFFVPPNIKQWTALLQRGNCTF 
KEKISRAAFHNAVAVVIYNNKSKEEPVTMTHPGTEHnAVMrTELRGKDILS^XE 
KNISVQMTIAVGTRMPPKNFSRGSLVFVSISnVLMIISSAWLIFYFIQKIRYTNA 
RDRNQRR1.GDAAKKAISKLTTRTVKKGDKETDPDFDHCAVCIESYKQNDVVRI 

LPCKHVFHKSCVDPWLSEHCTGPMCKLNILK.^LGIVPNLPCTDNVAFDMERLT
RTQAVNRRSALGDLAGDNSLGLEPLRTSGISPLPQDGELTPRTGEIN1AVTKEW 
FILASFGLLSALTLCYMHRATASLNANEVEWF(SEQ ID NO:703); and 
"HGVADHLGCDPQTRFFVPPNIKQWIAJLLQRGNCTFKEKISRAAFHNAVAV-VIY 
NNKSKEE (SEQ ID NO:704). An additional embodiment is the polynucleotide
fragments encoding these polypeptide fragments. When tested against Jurkat cell lines,
supernatants removed from cells containing this gene activated the GAS pathway.
Thus, it is likely that this gene activates immune cells through the JAKS/STAT signal
transduction pathway.

This gene is expressed primarily in macrophage, breast, kidney and to a lesser
extent in synovium, hypothalamus and rhabdomyosarcoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, schizophrenia and cancer. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune and neural system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to zinc finger protein indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the treatment
of schizophrenia, kidney disease and other cancers. The tissue distribution in
macrophage, breast, and kidney origins indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and intervention of tumors within
these tissues, in addition to other tumors where expression has been indicated. The
protein, as well as antibodies directed against the protein, may show utility as a tumor
marker and/or immunotherapy target for the above listed tumors and tissues. Because
the gene is expressed in cells of lymphoid origin, the natural gene product may be
involved in immune functions. Therefore, it may also be used as an agent for
immunological disorders including arthritis, asthma, immune deficiency diseases such
as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 147

The translation product of this gene shares sequence homology with HNP36
protein, an equilibrative nucleoside transporter, which is thought to be important in
gene transcription as well as serving as an important component of the nucleoside
transport apparatus (See Accession No. 1845345). One embodiment for this gene is
the polypeptide fragments comprising the following amino acid sequence:
MSGQGLAGFFASVAMICALASGSELSES.AFG^TrTACAVIILTnCYLGLPRLEFYR
YYQQLKLEGPGEQETKLDLISKGEEPRAGKEESGVSVSNSQPTNESHSIICAILK 
NISVLAFSVCFIFTITIGMFPAVTVEVKSSL^ 
RSLTAVFMWPGKDSRWLPSWXL^^ 

FFMAAFAFSNGYLASLCMCFGPKKVKPAEAETAEPSWPSSC^ 
PSCSGQLCDKGWTEGLPASLPVCLLPLPSARGDPEWSGGFFF (SEQ ID
NO:705); MSGQGLAGFFASV^nCMASGSELSESAFG\TITACAVI[LTnC 
YLGLPRLEFYRYYQQLKLE GPGEQETKLDLISKGEEPRAGKEESGVSVSNSQ 
PTNESHSI (SEQ ID NO:706); SGVSVSNSQPTNESHSIKAILKNISVLAFSVCFI 
FTITIGMFPAVTVEVKSSIAGSSTWERYFIPVSCFT-TFNIFDWLGRS (SEQ ID 
NO:707); TOMI^AVTVEVKSSIAGSST

^VPGKDSRWLPSWXL.ARLVFVPLLLLCNIK PRRYLTVVFEHDA (SEQ ID 
NO:708); FGPKXVKPAE.AETAEPSWPSSCVWV\V-HWGLFSPSCSGQLCDK
GWTEGLPASLPVCLLPLPSARGDPEWSGGFFF (SEQ ID NO:709). An additional
embodiment is the polynucleotide fragments encoding these polypeptide fragments. 

This gene is expressed primarily in eosinophils and aortic endothelium and to a 
lesser extent in umbilical vein endothelial cell and thymus. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematopoietic disease. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the circulatory system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to HNP36 protein indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the treatment
of blood neoplasias and other hematopoietic disease.


FEATURES OF PROTEIN ENCODED BY GENE NO: 148 

This gene is expressed primarily in breast cancer cell lines, thymus stromal 
cells, and ovary. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, endocrine and female reproductive system diseases including breast
cancer. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
endocrine system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of endocrine
disorders. In addition, the tissue distribution in tumors of thymus, ovary, and breast
origins indicates that polynucleotides and polypeptides corresponding to this gene are
useful for diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. The protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy target for the
above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 149 

Translation product of this gene has homology to pmt1 and pmt2, two
conserved Schizosaccharomyces pombe genes. One embodiment for this gene is the
polypeptide fragments comprising the following amino acid sequence:
DDDGFEIVPEDPAKHRILD^ 

ELPEWFVQEEKQHRIRQLPVGKKEVEHYRKRWmNARPIXXXXXXXXXXX 
XXXXXXLEQTRKKLAEAVVNTVDIXRTRES (SEQ ID NO:710); 
DDDGFEIVPEDPAKHRILDPEG1LALG (SEQ
ID NO:711); KRWRELNARPIXXXXXXXXXXXXXXXXXLEQTRKICAE
AVVNTVDIXRTRES (SEQ ID NO:712). An additional embodiment is the
polynucleotide fragments encoding these polypeptide fragments (See Accession No. 
el216734). 

This gene is expressed primarily in retina and ovary and to a lesser extent in
breast cancer cells, epididymis and osteosarcoma.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neuronal growth disorders, cancer and reproductive system disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
neural and reproductive system, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 382 as residues: Met-1 to Gly-7.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis or treatment of reproductive
system disease and cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 150 

One embodiment for this gene is the polypeptide fragments comprising the following
amino acid sequence:

MIKDKGRARTAiTSSQPAHL 

PLNNKLLNS P AKTLPG ACGS PQ KLIDGFL KHEGP P AEKPLEELS ASTS G VPGLS 
S LQSDP AGCVRPPAPNLAG A VEFNDVKTLLREWITTISDP^EDILQVVKYCTD 
L1EEKDLEKLDLVIKYMKR 

VT (SEQ ID NO:713); MIKDKGRARTALTSSQP.AHLCPENPLLHLKLA.AVKE 

KKRNKKKKTIGSPKRIQ (SEQ ID NO:714); KRIQS PLNNKLLNS PAKT

LPGACGSPQKLIDGFLKHEGPP AEKPLEELS ASTSGVPGLSSLQSDPAGCVRPP 

APNLAGAVEFNDVKTLLREWTTTISDPM (SEQ ID NO:715); 

TISDP^IEEDrLQVVKYCTDLIEEKDLEKLDLVIKY^IK^LMQQSVE 

S VWN^LAFDFTLDNV Q V VLQQT'YGSTLK VT (SEQ ID NO:716). An additional

embodiment is the polynucleotide fragments encoding these polypeptide fragments. 

This gene is expressed primarily in 12 week embryo and to a lesser extent in 
hemangiopericytoma and frontal cortex. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, growth disorders and hemangiopericytoma. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the circulatory and neural systems, expression
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 383 as residues: Leu-4 to Lys-11.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment of growth disorders, 
hemangiopericytoma and other soft tissue tumors.

FEATURES OF PROTEIN ENCODED BY GENE NO: 151

The translation product of this gene has been found to have homology to a
human DNA mismatch repair protein PMS3. Preferred polypeptide fragments comprise
the following amino acid sequence: FCHDCKFPEASPAMNCEP (SEQ ID NO:717).
Also preferred are polynucleotide fragments encoding these polypeptide fragments (See
Accession No. R95250).

This gene is expressed primarily in neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, lymphoma, immunodeficiency diseases, and cancers resulting from
genetic instability. Similarly, polypeptides and antibodies directed to these polypeptides
are useful in providing immunological probes for differential identification of the
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 384 as residues: Met-1 to Lys-6.

The tissue distribution in neutrophils and the sequence homology indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis of
Hodgkin's lymphoma, since the elevated expression and secretion by the tumor mass
may be indicative of tumors of this type. Additionally, the gene product may be used as
a target in the immunotherapy of the cancer. Because the gene is expressed in cells of
lymphoid origin, the natural gene product may be involved in immune functions.
Therefore, it may also be used as an agent for immunological disorders including
arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia.
Furthermore, its homology to a known DNA repair protein suggests that this gene may be
useful in establishing cancer predisposition and prevention in gene therapy applications.

FEATURES OF PROTEIN ENCODED BY GENE NO: 152

This gene is expressed primarily in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, infectious diseases and lymphoma. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment of inflammation and infectious 
diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 153

One embodiment for this gene is the polypeptide fragments comprising the
following amino acid sequence:

MASSVPAGGHTRAGGIFLIGKLDLEASLFKSFQWLPFVLRKKC
NFFCWDSSAHSLPLHPLSASCSAPACHASDTHLLYPSTR.ALCPSIFAWLVAPHS
VFRTNAPGPTPSSQSSPVFPVFPVSFMALIVCXLVCC (SEQ ID NO:720);

MASSVPAGGHTRAGGIFLIGKLDLEASLFKSFQWLPFVLRKKCNFFCWDSSAH
SLPLHPLSASCSAPACHA (SEQ ID NO:721); FAWLVAPHSVFRTNAPGPTPS
SQSSPVFPVFPVSFMALIVCXLVCC (SEQ ID NO:722). An additional embodiment
is the polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed primarily in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and infectious disease. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 386 as residues:
Ser-11 to Pro-17.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment of infectious diseases and 
inflammation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 154 

This gene is expressed in multiple tissues including ovary, uterus, adipose 
tissue, brain, and the liver. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, uterine, ovarian, brain, and liver cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the female reproductive system, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.
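
By way of illustration only, the comparison against a "standard gene expression level" described above can be sketched as a fold-change test. The specification does not define what counts as "significantly higher or lower"; the two-fold threshold, the use of the mean of healthy samples as the standard, and the function name below are all assumptions made for this sketch.

```python
from statistics import mean


def classify_expression(sample_level, healthy_levels, fold_threshold=2.0):
    """Compare a sample's expression level with a standard level.

    The standard is taken here as the mean of measurements from healthy
    tissue or bodily fluid (an assumption). The fold_threshold of 2.0 is
    likewise an illustrative choice, not a value from the specification.
    """
    if not healthy_levels:
        raise ValueError("at least one healthy measurement is required")
    if sample_level < 0 or any(level <= 0 for level in healthy_levels):
        raise ValueError("expression levels must be positive")

    standard = mean(healthy_levels)
    ratio = sample_level / standard

    if ratio >= fold_threshold:
        return "higher"
    if ratio <= 1.0 / fold_threshold:
        return "lower"
    return "comparable"


# Hypothetical measurements in arbitrary units.
print(classify_expression(8.0, [2.0, 2.0]))
print(classify_expression(2.5, [2.0, 2.0]))
```

In practice a statistical test over replicate measurements would replace the fixed threshold; the sketch only shows the shape of the comparison.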

The tissue distribution of this gene indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnostic or therapeutic uses in
the treatment of disorders of the female reproductive system, obesity, and liver disorders,
particularly cancer in the above tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 155 

This gene maps to chromosome 3 and, therefore, may be used as a marker in
linkage analysis for chromosome 3 (See Accession No. D87452).

This gene is expressed in multiple tissues including brain, aortic endothelial
cells, smooth muscle, pituitary, testis, melanocytes, spleen, neutrophils, and placenta.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological disorders including immunodeficiencies, cancers of the
brain and the female reproductive system, as well as cardiovascular disorders, such as
atherosclerosis and stroke. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the central nervous and immune systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution suggests that polynucleotides and polypeptides
corresponding to this gene are useful in treatment/detection of disorders in the nervous
system, including schizophrenia, neurodegeneration, neoplasia, brain cancer as well as
cardiovascular and female reproductive disorders including cancer within the above
tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 156 

The translation product of this gene shares sequence homology with the human
gene encoding cytochrome b561 (See Accession No. P10897). Cytochrome b561 is a
transmembrane electron transport protein that is specific to a subset of secretory vesicles
containing catecholamines and amidated peptides. This protein is thought to supply
reducing equivalents to the intravesicular enzymes dopamine-beta-hydroxylase and
alpha-peptide amidase. Preferred polypeptides of the invention comprise the amino acid
sequence:

MAMEGYWRFLALLGSALLVGFL^ 

VLMVTGFVTIQGIAIIYYRLPWTVVK

NHNVNNL^NMYSLHSWVGLLAVICYLLQLLSGFSVFLLPWAPLSLRAFLMPIHV 
YSGIVIFGTVL^TALMGLTEKLIFSLRDPAYSTFPPEGVFVNTLGLLILVFGALIF 
WIVTRPQWKRPKEPNSTILHPNGGTEQGARGSMPAYSGNNMDKSDSEL 
NSEVAARKRNLALDEAGQRSTM (SEQ ID NO:724); as well as antigenic fragments
of at least 20 amino acids of this gene and/or biologically active fragments. Also
preferred are polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed primarily in anergic T-cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune system and metabolism related diseases. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the immune system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein product or RNA of this gene is
useful for treatment or diagnosis of immune system and metabolic diseases or
conditions including Tay-Sachs disease, phenylketonuria, galactosemia, various
porphyrias, and Hurler's syndrome.

FEATURES OF PROTEIN ENCODED BY GENE NO: 157 

The translation product of this gene shares sequence homology with collagen,
which is important in mammalian development. This gene also shows sequence
homology with bcl-2. (See Accession No. P80988.) Preferred polypeptide fragments
comprise the amino acid sequence: PGRAGPSPGLSLQLPAEPGHPAGNLAPL
TSRPQPLCRIPAVPG (SEQ ID NO:725). Also preferred are polynucleotide
sequences encoding this polypeptide fragment.

This gene is expressed primarily in HL-60 tissue culture cells and to a lesser 
extent in liver, breast, and uterus. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological diseases, hereditary disorders involving the MHC class
of immune molecules, as well as developmental disorders and reproductive disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune and reproductive systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 390 as residues: Ser-39 to Gly-46,
Leu-49 to Ala-62.

The tissue distribution and homology to collagen indicate that polynucleotides
and polypeptides corresponding to this gene are useful for diagnosis and treatment of
hereditary MHC disorders and particularly autoimmune disorders including rheumatoid
arthritis, lupus, scleroderma, and dermatomyositis, as well as many reproductive
disorders, including cancer of the uterus and breast tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 158 

This gene is expressed primarily in the amygdala region of the brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, a variety of brain disorders, particularly those affecting mood and
personality. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the brain and central nervous system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and/or diagnosis of a variety of brain 
disorders, particularly bipolar disorder, unipolar depression, and dementia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 159 

This gene is expressed in a variety of tissues and cell types including brain, 
smooth muscle, kidney, salivary gland and T-cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers of a variety of organs including brain, smooth muscle, kidney,
salivary gland and T-cells and cardiovascular disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the central nervous, urinary, salivary,
digestive, and immune systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution in brain, smooth muscle, and T-cells indicates that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis of
various neurological and cardiovascular disorders, including, but not limited to, cancer
within the above tissues. Additionally, the gene product may be used as a target in the
immunotherapy of the cancer. Because the gene is expressed in cells of lymphoid
origin, the natural gene product may be involved in immune functions. Therefore, it may
also be used as an agent for immunological disorders including arthritis, asthma,
immune deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 160 

The translation product of this gene shares sequence homology with collagen,
which is thought to be important in cellular interactions and extracellular matrix formation
and has been found to be an identifying determinant in autoimmune disorders.
Moreover, this gene shows sequence homology with the yeast protein Sls1p, an
endoplasmic reticulum component involved in the protein translocation process in the
yeast Yarrowia lipolytica. (See Accession No. 1052828; see also J. Biol. Chem. 271,
11668-11675 (1996).) With mouse, this same region shows sequence homology with
the heavy chain of kinesin. (See Accession No. 2062607.) Recently, suppression of the
heavy chain of kinesin was shown to inhibit insulin secretion from primary cultures of
mouse beta-cells. (See Endocrinology 138 (5), 1979-1987 (1997).) Moreover, kinesin
was found associated with drug resistance and cell immortalization. (See 468355.)
Thus, it is likely that this gene also acts as a genetic suppressor element.

This gene is expressed primarily in the greater omentum and to a lesser extent in
a variety of organs and cell types including gall bladder, stromal bone marrow cells,
lymph node, liver, testes, pituitary, and thymus.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the endocrine, gastrointestinal, and immunological systems,
including autoimmune disorders and cancers in a variety of organs and cell types.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune and gastrointestinal systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 393 as residues: Asn-27 to Leu-47,
Gln-81 to Lys-88, Asp-93 to Lys-102, Asn-107 to Leu-116, Met-129 to Glu-141, Glu-150
to Asp-157, Lys-176 to Glu-185, Glu-333 to Tyr-349, Cys-393 to Leu-403, Gln-423
to Gly-429.
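
As a hedged illustration, residue ranges written in the form used above (for example, "Asn-27 to Leu-47") can be parsed into 1-based coordinates and used to pull the corresponding fragment out of a polypeptide sequence. The function names are hypothetical, and the short test sequence is invented for the example; it is not any SEQ ID NO of this specification.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Asn-27 to Leu-47".
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return (first_residue, start, last_residue, end) tuples, 1-based."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def extract_epitope(sequence, epitope_range):
    """Slice the fragment out of a one-letter sequence and check its ends."""
    first, start, last, end = epitope_range
    if start < 1 or end < start or end > len(sequence):
        raise ValueError("range falls outside the sequence")
    fragment = sequence[start - 1:end]
    if fragment[0] != THREE_TO_ONE[first] or fragment[-1] != THREE_TO_ONE[last]:
        raise ValueError("boundary residues do not match the sequence")
    return fragment


ranges = parse_epitope_ranges("Asn-27 to Leu-47, Asp-93 to Lys-102")
print(ranges)

# Invented nine-residue sequence, used only to exercise the slicing.
print(extract_epitope("MKTLLNAGK", ("Leu", 4, "Lys", 9)))
```

Checking the boundary residues against the sequence is a cheap guard against off-by-one errors or a mismatched sequence listing.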

The tissue distribution within various endocrine and immunological tissues,
combined with the sequence homology to a conserved collagen motif, indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the
diagnosis of various autoimmune disorders including, but not limited to, rheumatoid
arthritis, lupus erythematosus, scleroderma, and dermatomyositis. Because the gene is
expressed in cells of lymphoid origin, the natural gene product may be involved in
immune functions. Therefore, it may also be used as an agent for immunological
disorders including arthritis, asthma, immune deficiency diseases such as AIDS, and
leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 161 

This gene has homology to the tissue inhibitor of metalloproteinase 2. Such
inhibitors are vital to proper regulation of metalloproteins such as collagenases (See
Accession No. P16368). In addition, this gene maps to chromosome 17 and,
therefore, may be used as a marker in linkage analysis for chromosome 17 (See
Accession No. P16368).

This gene is expressed primarily in several types of cancer including
osteoclastoma, chondrosarcoma, and rhabdomyosarcoma and to a lesser extent in
several non-malignant tissues including synovium, amygdala, testes, and placenta.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, various types of cancer, particularly cancers of bone and cartilage, as
well as various autoimmune disorders. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the musculoskeletal system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution in various cancers and the sequence homology to a
collagenase inhibitor indicate that polynucleotides and polypeptides corresponding to
this gene are useful for detection of various autoimmune disorders such as rheumatoid
arthritis, lupus, scleroderma, and dermatomyositis. Therefore, it may also be used as an
agent for immunological disorders including arthritis, asthma, immune deficiency
diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 162 

This gene is homologous to the mitochondrial ATP6 gene and therefore is likely
a member of this gene family (See Accession No. X76197).

This gene is expressed primarily in brain tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, a variety of brain disorders, including Down's syndrome, depression,
schizophrenia, and epilepsy. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the central nervous system, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution in brain tissue indicates that this gene is useful for diagnosis
of various neurological disorders including, but not limited to, brain cancer.
Additionally, the gene product may be used as a target in the immunotherapy of cancer in
the brain as well as for the diagnosis of metabolic disorders such as obesity, Tay-Sachs
disease, phenylketonuria and Hurler's syndrome.

FEATURES OF PROTEIN ENCODED BY GENE NO: 163

This gene is expressed primarily in placenta, neutrophils, and microvascular
endothelial cells and to a lesser extent in multiple tissues including brain, prostate,
spleen, thymus, and bone.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neutropenia and other diseases of the immune system. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution in placenta indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis of various female
reproductive disorders. Additionally, the gene product may be used as a target in the
immunotherapy of various cancers. Because the gene is expressed in some cells of
lymphoid and endocrine origin, the natural gene product may be involved in immune
functions and metabolism regulation, respectively. Therefore, it may also be used as an
agent for immunological disorders including arthritis, asthma, immune deficiency
diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 164 

This gene is expressed primarily in neutrophils, monocytes, bone marrow, and
fetal liver.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune system disorders including, but not limited to, autoimmune
disorders such as lupus, and immunodeficiency disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution in various immune system tissues indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the
diagnosis of various immunological disorders such as Hodgkin's lymphoma, arthritis,
asthma, immune deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 165 

The translation product of this gene shares sequence homology with dystrophin,
which is thought to be defective in both Duchenne and Becker Muscular Dystrophy.
Preferred polypeptide fragments comprise the following amino acid sequence:

MKLLGECSSSIDSVKRLEHKLKEEEESLPGFVNXHSTETQTAGVIDRWELLQAQ

ALSKELRMKQ^QKWQQFNSD 

IKKLKELQKAV]^ 

CSLLEEWRGLLQDALMQCQGFH^ 

QDHHKQLMQIKHELLESQLRVASLQDMSCQLLVNAEGTDCLEAKEKVHVIGNR
LKLLLKEVSRHIKELEKLLDVSSSQQDLSSWSSADELDTSGSVSPXSGRSTPNR 
QKTPRGKCSLSQPGPSVSSPHSRSTKGGSDSSLSEPXPGRSGRGFLFRVLRAA 
LPLQLLLLLLIGLACLVPMSEEDYSCALSNNFARSFHPMLRYTNGPPPL (SEQ ID 
NO:726); MKLLGECSSSIDSVKRLEHKLKEEEESLPGFVNLHSTETQTAGVIDR 

WELLQAQALSKELRMKQNLQ

TDIQTEELQIK (SEQ ID NO:727); KLKELQKL^VDHRICAIILSINLCSPEFTQADSK 

ESRI)LQDRLXQMNGRWDRVCSLI^ 

ENroRRKNEIWIDSNTDAEILQDHH 

(SEQ ID NO:728); QDMSCQLLVNAEGTDCLEAKEKVHVIGNRLKLLLKEVS 
RHIKELEKLLDVSSSQQDLSSWSSADELDTSGSVSPXSGRSTPNRQKTPRGKCS
LSQPGPSVSSPHS (SEQ ID NO:729); DSSLSEPXPGRSGRGFLFRVLRAAL
PLQLLLLLLIGLACLVPMSEEDYSCALSNNFARSFHPMLRYTNGPPPL (SEQ ID
NO:730). Also preferred are polynucleotide fragments encoding these polypeptide
fragments. Furthermore, this gene maps to chromosome 6 and, therefore, may be used
as a marker in linkage analysis for chromosome 6 (See Accession No. N62896).

This gene is expressed in numerous tissues including the heart, kidney, and
brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, musculoskeletal disorders including Muscular Dystrophy and
cardiovascular diseases. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the muscle tissues, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.
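
The comparison against the standard gene expression level described above can be sketched as follows. This is a minimal illustration only: the function name, the use of a log2 fold change, and the twofold cutoff are assumptions made for the example and are not specified by this disclosure.

```python
import math


def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Compare a sample's expression level with the standard level.

    fold_threshold is a hypothetical cutoff chosen for illustration only.
    Returns "higher", "lower", or "unchanged".
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    # Symmetric measure: +1 means twofold higher, -1 means twofold lower.
    log2_fold = math.log2(sample_level / standard_level)
    cutoff = math.log2(fold_threshold)
    if log2_fold >= cutoff:
        return "higher"
    if log2_fold <= -cutoff:
        return "lower"
    return "unchanged"
```

Here standard_level stands for the level measured in healthy tissue or bodily fluid from an individual not having the disorder, and sample_level for the level measured in the individual being tested.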

The tissue distribution and homology to dystrophin indicate that
polynucleotides and polypeptides corresponding to this gene are useful for the
diagnosis and treatment of Muscular Dystrophy and other muscle disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 166

This gene is expressed primarily in human cerebellum.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the central nervous system, including Alzheimer's Disease,
Parkinson's Disease, ALS, and mental illnesses. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the central nervous system, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 399 as residues: Pro-20 to Gly-26, Leu-37 to Pro-42, His-57 to Gly-63.
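
Epitope ranges written in the form "Pro-20 to Gly-26" can be read mechanically as 1-based, inclusive residue positions. The sketch below shows one way to parse such ranges and to cut the matching fragment out of a one-letter amino acid sequence. The function names are hypothetical, and because SEQ ID NO: 399 is not reproduced in this section, any sequence passed in must be supplied by the reader.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return (start_residue, start_pos, end_residue, end_pos) tuples."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def extract_epitope(sequence, start_res, start, end_res, end):
    """Slice a 1-based, inclusive range and verify its boundary residues."""
    fragment = sequence[start - 1:end]
    if start < 1 or len(fragment) != end - start + 1:
        raise ValueError("range lies outside the sequence")
    if (fragment[0] != THREE_TO_ONE[start_res]
            or fragment[-1] != THREE_TO_ONE[end_res]):
        raise ValueError("boundary residues do not match the sequence")
    return fragment
```

The boundary check is a deliberate design choice: it catches numbering that is off by one, or a range applied to the wrong sequence, before a fragment is used.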

The tissue distribution indicates that the protein products of this gene are useful
for treatment/diagnosis of diseases of the central nervous system and may protect or
enhance survival of neuronal cells by slowing progression of neurodegenerative
diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 167

Preferred polypeptides encoded by this gene comprise the following amino acid
sequence:

MKLLICGN^XAPSHSESSRRCCLLCFYPLCLEINFGMKVFLSMPFLVLFQ 
SLIQED (SEQ ID NO:731). Polynucleotides encoding such polypeptides are also 
provided. This gene is believed to reside on chromosome 15. Therefore, polynucleotides
derived from this gene are useful in linkage analysis as chromosome 15 markers.

This gene is expressed primarily in human testes tumor and to a lesser extent in 
normal human testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the testes, particularly cancer, and other reproductive
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the male reproductive tissues, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for treatment/diagnosis of testicular diseases including cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 168

This gene is expressed primarily in fetal liver.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, conditions affecting hematopoietic development and metabolic diseases.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
hepatic system, and fetal hematopoietic system, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 401 as residues: His-7 to Trp-17,
Leu-19 to Lys-27, Pro-33 to Gly-44, Lys-68 to Gly-74, Lys-85 to Cys-95.

The tissue distribution indicates that the protein products of this gene are useful
for treatment/diagnosis of diseases of the developing liver and hematopoietic system,
and act as a growth differentiation factor for hematopoietic stem cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 169 

The polypeptide encoded by this gene is believed to be a membrane bound
receptor, the extracellular domain of which is expected to consist of the following
amino acid sequence:
RILLVKYSANEENKYDYLPT^ 
ASWKEFSDFMKWSIPAFLYF^ 

LKXRLNWIQWASLLTLFLSIVALTAGTKTLQHNLAGRGFHHDAFFSPSNSCLL 
FRNECPRKDNCTAKEWTFPEAK\V7^

YNEKILKEGNQLTEXinQNSKLYFFGILFNGLTLGLQRSNRDQIKNCGFFYGH 
S (SEQ ID NO:732). Thus, preferred polypeptides encoded by this gene comprise the
extracellular domain as shown above. It will be recognized, however, that deletions of
either end of the extracellular domain up to the first cysteine from the N-terminus and
the first cysteine of the C-terminus are expected to retain the biological functions of the
full-length extracellular domain because the cysteines are thought to be responsible for
providing secondary structure to the molecule. Thus, deletions of one or more amino
acids from either end (or both ends) of the extracellular domain are contemplated. Of
course, further deletions including the cysteines are also contemplated as useful, as such
polypeptides are expected to have immunological properties such as the ability to evoke
an immune response. Polynucleotides encoding all of the foregoing polypeptides are
provided.
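
The terminal deletions described above can be enumerated mechanically. The sketch below assumes one reading of the passage: residues may be removed from the N-terminus up to, but not including, the first cysteine, and from the C-terminus back to, but not including, the last cysteine (the first cysteine counted from the C-terminus). The function names and the toy sequence in the usage note are hypothetical.

```python
def cysteine_bounds(domain):
    """Return 0-based indices of the first and last cysteine in the domain."""
    first = domain.find("C")
    last = domain.rfind("C")
    if first == -1:
        raise ValueError("domain contains no cysteine")
    return first, last


def terminal_deletion_variants(domain):
    """Yield every fragment that retains the first-to-last cysteine span."""
    first, last = cysteine_bounds(domain)
    # n_start: how many N-terminal residues are removed (0 .. first).
    for n_start in range(first + 1):
        # c_end: exclusive end index; must stay past the last cysteine.
        for c_end in range(last + 1, len(domain) + 1):
            yield domain[n_start:c_end]
```

For a toy domain such as "AKCDEFCGH", this yields nine fragments ranging from the full-length domain down to the cysteine-bounded core "CDEFC".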

This gene is expressed primarily in human osteoclastoma and to a lesser extent
in hippocampus and chondrosarcoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers, particularly those of the bone and connective tissues. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the skeletal system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 402 as residues: Met-1 to Cys-6, Ala-41 to Tyr-49, Lys-76 to Lys-84.

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis of cancers of the bone and connective tissues, and may act as growth 
factors for cells involved in bone or connective tissue growth. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 170 

Preferred polypeptides encoded by this gene comprise the following amino
acid sequence:

NSVPNLQTLAVLTEAIGPEPAIPRXPREPPVATSTPATPSAGPQPLPTGTV 

LVPGGPAPPCLGEAWALLLPPCRPSLTSCFWSPRPSPWKETGV (SEQ ID
NO:733). Polynucleotides encoding such polypeptides are also provided herein.

This gene is expressed primarily in hematopoietic progenitor cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the blood including cancer and autoimmune disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
blood/circulatory system, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 403 as residues: Gln-4 to His-10, Pro-25
to His-32.

The tissue distribution indicates that the protein products of this gene are useful
for diagnosis of diseases involving growth differentiation of hematopoietic cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 171 

Preferred polypeptides encoded by this gene comprise the following amino acid 
sequences: ALQLAFYTDAVEEWLEENVHPSLQEU.QXLLQDLSEVSAPP (SEQ ID 
NO:734); and/or CHPPALAGTLLRTPEGRAHARGLLLEAGGA (SEQ ID NO:735). 
Polynucleotides encoding such polypeptides are also provided. The protein product of
this gene shares sequence homology with metallothioneins. Thus, polypeptides encoded
by this gene are expected to have metallothionein activity; such activities are known in
the art and described elsewhere herein.

This gene is expressed primarily in kidney cortex. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the kidney including cancer and renal dysfunction. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the renal system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 404 as residues: Ser-47 to Gln-52. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment/diagnosis of diseases of the kidney 
including kidney failure. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 172 

This gene is expressed primarily in 12 week old early stage human. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the developing embryo, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 405 as residues: Gln-31 to Thr-43, Gly-51 to Ser-58, Pro-65 to Pro-72.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment/diagnosis of developmental
problems with fetal tissue. The gene may be involved in vital organ development in the
early stage, especially hematopoiesis, cardiovascular system, and neural development.

FEATURES OF PROTEIN ENCODED BY GENE NO: 173

The translation product of this gene shares sequence homology with TGN38, an
integral membrane protein previously shown to be predominantly localized to the
trans-Golgi network (TGN) of cells.

This gene is expressed primarily in developing embryo and to a lesser extent in
cancer tissues including lymphoma, endometrial, prostate and colon.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities and cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the developing fetus, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 406 as residues: His-65 to Ser-72, Pro-82 to Gly-91, Pro-98 to Glu-118, Ser-126
to Gly-166, Pro-180 to Asp-188, Tyr-209 to Lys-214, Gln-220 to Leu-228.

The tissue distribution and homology to an integral membrane protein indicate
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis of cancers and developmental abnormalities where aberrant expression relates
to an abnormality.

FEATURES OF PROTEIN ENCODED BY GENE NO: 174 

The translation product of this gene shares sequence homology with a dnaJ heat
shock protein from E. coli which is allelic to sec63, a gene that affects transit of nascent
secretory proteins across the endoplasmic reticulum in yeast.

This gene is expressed primarily in Hodgkin's lymphoma and to a lesser extent
in testes.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 407 as residues: Thr-13 to Trp-21,
Arg-74 to Asp-81.

The tissue distribution and homology to dnaJ indicate that polynucleotides and
polypeptides corresponding to this gene are useful as a diagnostic for cancer including
Hodgkin's lymphoma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 175 

This gene is expressed primarily in endothelial cells and to a lesser extent in
bone marrow stromal cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases involving angiogenic abnormalities including diabetic
retinopathy, macular degeneration, and other diseases including arteriosclerosis and
cancer. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
vascular system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful
for treating diseases where an increase or decrease in angiogenesis is indicated and as a
factor in the wound healing process.

FEATURES OF PROTEIN ENCODED BY GENE NO: 176 

The translation product of this gene shares sequence homology with MAT-8
(mouse) which is thought to be important in regulating chloride conductance in cells
(particularly in the breast) by modulating the response mediated by cAMP and protein
kinase C to extracellular signals.

This gene is expressed primarily in amniotic cells and hematopoietic cells
including macrophages, neutrophils, T cells, TNF induced aortic endothelium and to a
lesser extent in testes, TNF induced epithelial cells, and smooth muscle.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory responses mediated by T cells, macrophages, and/or
neutrophils particularly those involving TNF, and also cancer. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the immune system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 409 as residues: Thr-19 to Ala-33, Leu-54 to Asp-82, Pro-89 to Ala-97, Pro-100
to Lys-125, Ser-127 to Phe-135, Gly-164 to Leu-169, Cys-173 to Arg-178.

The tissue distribution and homology to MAT-8 indicate that polynucleotides and
polypeptides corresponding to this gene are useful for modifying inflammatory
responses to cytokines such as TNF and thus modifying the duration and/or severity of
inflammation. Polynucleotides and polypeptides derived from this gene are thought to
be useful in the diagnosis and treatment of cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 177

This gene is expressed primarily in endothelial cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, vascular restenosis. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the vascular system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treating diseases associated with vascular
response to injury such as vascular restenosis following angioplasty.

FEATURES OF PROTEIN ENCODED BY GENE NO: 178 

One embodiment of the claimed invention comprises: 
MRPDWKAGAGPGGPPQKPAPSSQRKPPARP^

RLRLEEDKPAVERCIJEELWGDVENT)EDALLRRLRGPRVQEHEDSGDSE 
KGNFPPQKKPVWVDEEDED^ 

RLKEEFQHA^IGGVPAWAETTKRKTSSDDESEEDEDDLLQRTGNHSTSTSLPRG 
ILKMKNCQHANAERPTVARISICAVPSRCTDCDGCWD (SEQ ID NO:737); or 
CLEELVTGD VENDED ALLRRLRGPRVQEffi
WVDEEDEDEEMVDMMNN^ 

GVPAW.^ETTKRKTSSDDESEEDEDDLLQRTGNnSTSTSLPRGELKNIKNCQHA 
NAERPTVAR1SICAVPSRCTDCDGC (SEQ ID NO: 738). LKEKIVRSFEVSPDGS 
F1JLINGL\GYLHLLAMKTK^ 
WDVNSRKCLNRF\ 7 DEGSLYGLSL\TSRNGQYVACGSNCGVVNIYNQDSCLQE
TNPKPIKAIMhH-VTGVTSLTFN 

KNKNISHVHTMDFSPRSGYF,\LGNEKG1C\LMYRLHHYSDF (SEQ ID NO:739);
and/or KINGRV.\ASTFSSDSKKVYASSGDGEVYVWDVNSRKCU^RFYDEGSL
YGLSIATSRNGQYVACGSNCGVV^YNQDSCLQETNPKPD 
FNPTTEILAIASEKMKEAVRLVHLPSCT^ 
YFALGNEKGKAL (SEQ ID NO:740). 

This gene is expressed primarily in epididymis and endometrial tumors and to a
lesser extent in T cell lymphoma and cell lines derived from colon cancer.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, tumors of the reproductive organs including testis and endometrial cells.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
reproductive system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 411 as residues: Ser-67 to Lys-72, Val-87 to Leu-93,
Tyr-128 to Pro-141, Asp-204 to Gly-210.

The tissue distribution indicates that the protein products of this gene are useful 
for treating tumors of the endometrium or epithelial tumors of the reproductive system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 179

Preferred polypeptides encoded by this gene comprise the following amino acid 
sequence: 

MRILQLILLALATGLVGGETRIIKGFECKLHSQPWQAALFEKTRLLCGATLIAPR 

WLLTAAHCLKPRYIVHLGQHNLQKEEGCEQTRTATESFPHPGFNNSLPNKDH 
RND1MLVKMASPVSITWAW

D.APTSPSLSTRSVTITPTPATSQTPWCVPACRKG.ARTPARVTPG^WSVTSLFIC^ 

LSPGARIRVRSPESLVSTRKSANMWTGSRRR (SEQ ID NO:741); ETRHKGFEC 

KLHSQPWQAALFEKTRLLCGATLIAPR 

GCEQTRTATESFPHPGFNNSLPNKDHRNfDLML^ 
CVTAGTSCSFPAG^RPDP-SYACLTPCDAPTSPSLSTRSVRTPTPATSQTPWCVP

ACRKGARTPARVTPGALWSVTSLFKALSPG
SRRR (SEQ ID NO:742); or CKLHSQPWQAALFEKTRLLCGATLLAPRWLLT
AAHCLKPRYIVHLGQHNLQKEEGCEQTRTATESFPHPGFNS
(SEQ ID NO:743). The translation product of this gene shares sequence homology
with neuropsin, a novel serine protease which is thought to be important in modulating
extracellular signaling pathways in the brain. Owing to the structural similarity to other
serine proteases, the protein products of this gene are expected to have serine protease
activity which may be assayed by methods known in the art and described elsewhere
herein.

This gene is expressed primarily in endometrial tumor and to a lesser extent in
colon cancer, benign hypertrophic prostate, and thymus.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers of the endometrium or colon and benign hypertrophy of the
prostate. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the urogenital or reproductive systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 412 as residues: Gly-12 to Ser-22,
Pro-34 to Ser-53.

The tissue distribution and homology to serine proteases indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosing
or treating hyperproliferative disorders such as cancer of the endometrium or colon and
hyperplasia of the prostate.

FEATURES OF PROTEIN ENCODED BY GENE NO: 180 

Preferred polypeptides encoded by this gene comprise the following amino acid
sequence: VLQGRYFSPILEMRRLRPEGXXNLPGGSRAQKEPRQDLTLV'LWPHC 
PHFAMTRSYVPTKQCMVQGSFYCIFIFKGPVQNWC (SEQ ID NO:744). 
Polynucleotides encoding such polypeptides are also provided.

This gene is expressed primarily in fetal brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, identifying and expanding stem cells in the CNS. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the nervous system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein products of this gene are useful
for detecting and expanding stem cell populations in the (or of the) central nervous
system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 181 

This gene is expressed primarily in early stage human brain and a stromal cell
line.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities of the CNS. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the central nervous system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 414 as residues: Gln-42 to Gln-47, Gln-54 to Pro-60.
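
Residue ranges of the kind quoted above can be turned into sequence coordinates
mechanically. The sketch below is illustrative only: the function names and the short
placeholder sequence in the usage note are hypothetical, and it assumes that positions
are 1-based and inclusive, which the specification does not state explicitly.

```python
import re

# Matches ranges written as "Gln-42 to Gln-47" (three-letter residue code, position).
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return a list of (start_residue, start_pos, end_residue, end_pos) tuples."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def extract_epitope(sequence, start_pos, end_pos):
    """Slice a range out of a one-letter sequence (assumed 1-based, inclusive)."""
    if not 1 <= start_pos <= end_pos <= len(sequence):
        raise ValueError("range falls outside the sequence")
    return sequence[start_pos - 1:end_pos]
```

For example, parse_epitope_ranges("Gln-42 to Gln-47, Gln-54 to Pro-60") yields two
ranges, and extract_epitope applied to a placeholder string such as "ABCDEFGH" with
positions 3 and 5 returns the three characters at those positions.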

The tissue distribution indicates that the protein products of this gene play a role
in the development of the central nervous system. Therefore, this gene and its products
are useful for diagnosing or treating developmental abnormalities of the central nervous
system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 182 
Preferred polypeptides encoded by this gene comprise the following amino acid
sequence:

MPITOQVNPEUiDFMQSA 
FLACHPLMPIYFAAVWLYRE 

TFLFSFPHPNLLGRPLPNSKLRGRQPLLSKTLSWHQPSRGLIWCCGSGXRGLL 

L 0 RPEDRTKDVLTKPRTNRFVKL^ 

FP (SEQ ID NO:745); or CPEFFIPATLPCPFVFAFTSEASSRA YLTQRGPGGLAQ 
NLMPLPVGFWMGSLPPPWCWRKWV'SEACSCFC (SEQ ID NO:746). These
polypeptides are structurally similar to various TGF-beta family members. Thus, these
polypeptides are expected to have a variety of activities in the modulation of cell growth
and proliferation.

This gene is expressed primarily in osteoclastoma, microvascular endothelium,
and bone marrow-derived cell lines.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematological diseases, particularly those involving aberrant
proliferation of stem cells. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative
to the standard gene expression level, i.e., the expression level in healthy tissue or
bodily fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 415 as residues: Ser-33 to Ala-39.

The tissue distribution indicates that the protein products of this gene are useful
for treating disorders of the progenitors of the immune system. Applications include in
vivo expansion of progenitor cells, ex vivo expansion of progenitor cells, or the
treatment of tumors of the circulatory system, such as lymphomas.

FEATURES OF PROTEIN ENCODED BY GENE NO: 183

This gene maps to chromosome 17, and therefore polynucleotides of the
invention can be used in linkage analysis as a marker for chromosome 17. In specific
embodiments, polypeptides of the invention comprise the sequence:
GFGSVSAAGRRSGGTWQPVQ (SEQ ID NO:747); PGGLAVGSRWWSRSLT
(SEQ ID NO:748); LEP-SRQRRPRRRGGTSRPETDQRAKCWRQL (SEQ ID
NO:749); and/or VCLRCQNRMEN (SEQ ID NO:750). In further specific
embodiments, polypeptides of the invention comprise the sequence: MAACTARRPGR 
GQPLVVPVADXGPVAKAALCAAXAG^ 
GVFEWKQNGKYETGQL^HSIFGYRGVVLFPWQARLXDRDVASA.APEK.AEN
PAGHGSKEVKGKTHTYYQVLIDARDCPHISQRSQTEAVTFLANHDDSRALY.AIP 
GLDYVSHEDILPYTSTDQVPIQHELFERJFLLYDQTK.APPFV.ARETLRAW 
PWLELSDVHRETTEN1RVTVIPFY1VIGMREAQNSH 

LRERHV/RIFSLSGTLETVRGRGVVGREPVLSKEQP.AFQYSSHVSLQASSGHMW 

GTFRFERPDGSHFDVRIPPFSLESNKDEKTPPSGLHW (SEQ ID NO:751);
MAACTARRPGRGQPLVYPVADXGPVAKAALCAA (SEQ ID NO:752); 
VLETVGVFEVPKQNGKYETGQLFLHSIFGYRGVVL (SEQ ID NO:757); 
GLDYVSHEDDLPYTST (SEQ ID NO:758); DVHRETTENIRVTVIPFYM (SEQ ID 
NO:759); WWRYCIRLENLDSDVVQLRER (SEQ ID NO:760); and/or
PAFQYSSHVSLQASSGHMWGTFRFER (SEQ ID NO:761). Polynucleotides encoding these
polypeptides are also encompassed by the invention.

This gene is expressed primarily in gall bladder, prostate, and fetal brain, and to 
a lesser extent in a few tumor and fetal tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, growth-related disorders such as cancers. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the prostate, gall bladder, and fetal brain,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of growth-related
disorders, such as cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 184 

In specific embodiments, polypeptides of the invention comprise the
sequence: SLCCPEGAEGC (SEQ ID NO:762) and/or QLKKTHYDRPCP (SEQ ID
NO:763). Polynucleotides encoding these polypeptides are also encompassed by the
invention.
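
Whether a given polypeptide "comprises" one of the recited sequences amounts to a
contiguous-substring check. The sketch below illustrates this under stated assumptions:
the function names and the candidate polypeptide in the usage note are hypothetical, the
recited sequences are written without spaces, and only exact matches are considered
(no variants or conservative substitutions).

```python
# Recited fragments keyed by their sequence identifiers.
FRAGMENTS = {
    "SEQ ID NO:762": "SLCCPEGAEGC",
    "SEQ ID NO:763": "QLKKTHYDRPCP",
}


def comprises(polypeptide, fragment):
    """Return True if the fragment occurs as a contiguous run in the polypeptide."""
    return fragment.upper() in polypeptide.upper()


def matching_fragments(polypeptide, fragments=FRAGMENTS):
    """Return the set of identifiers whose fragment is contained in the polypeptide."""
    return {seq_id for seq_id, frag in fragments.items() if comprises(polypeptide, frag)}
```

For a hypothetical candidate such as "MASLCCPEGAEGCGG", matching_fragments reports
only SEQ ID NO:762, because the second fragment does not occur in it.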

This gene is expressed primarily in stromal cells, tonsil, and glioblastoma and to
a lesser extent in some other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune and inflammatory disorders and glioblastoma. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the stromal cells,
tonsil, and glioblastoma, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Additionally, it is believed that the
product of this gene regulates pancreatic cell differentiation into beta cells. Accordingly,
polynucleotides and polypeptides of the invention are useful in the treatment of
insulin-dependent diabetes mellitus and associated conditions (e.g., pancreatic
hypofunction), and in the prevention as well as the treatment of undifferentiated-type
pancreatic cancers. Preferred epitopes include those comprising a sequence shown in
SEQ ID NO: 417 as residues: Pro-27 to Ala-32.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of immune and 
inflammatory disorders and glioblastoma. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 185 

This gene is expressed primarily in hepatocellular carcinoma and to a lesser
extent in other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver diseases. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the liver, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 418 as residues: Gly-32 to Lys-39.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of liver diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 186 

This gene is expressed primarily in hippocampus and to a lesser extent in other
tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the hippocampus, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neuronal disorders. 






FEATURES OF PROTEIN ENCODED BY GENE NO: 187 

This gene is expressed primarily in bone cancer and hippocampus and to a
lesser extent in osteoclastoma and other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, bone-related disorders and neuronal diseases. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the bone, osteoclast, and
hippocampus, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of bone-related
disorders and neuronal diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 188

This gene maps to chromosome 4, and therefore polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 4.

This gene is expressed primarily in neuronal tissues such as hippocampus, 
spinal cord, and hypothalamus and to a lesser extent in a few other tissues such as 
ovary. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal diseases. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the neuronal tissues, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of neuronal disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 189

This gene maps to chromosome 10, and therefore polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 10.

This gene is expressed primarily in neuronal tissues and immune tissues, and to
a lesser extent in a few other tissues such as skin tumor and lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal and immune-related disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the neuronal and immune-related tissues,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 422 as residues: Pro-19 to Asp-25.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neuronal and 
immune-related disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 190

The translation product of this gene shares sequence homology with human 
N33, a gene located in a homozygously deleted region of human metastatic prostate 
cancer which is thought to be important in prevention of prostate cancer. In specific 
embodiments, polypeptides of the invention comprise the sequence: 

AQRKKEMVLSEKVSQLM^
LQLHRQCVVCKQADEEFQ1LANSWRYS
NS.APTFLNFPAKGKPKRGDTYELQVRGFSAEQLARWLADRTDVNIRVIRPPNNL^
ARWRFWCVSVT (SEQ ID NO:765); MVV.ALLIVCDVPSAS (SEQ ID NO:766);
AQRKKEMVLSEKVSQL (SEQ ID NO:767); MEWTNKRPVIRiVLNGDKF (SEQ
ID NO:768); RRLVKAPPRNYSVr^^
SS AFTNRJDFF A (SEQ ID NO:769); M VDFDEGS D VFQ lVDLNIVINS APTFINFP AK
GKP (SEQ ID NO:770); KRGDTYELQVRGFS.AEQL^RWL\DRTDV"NIRVIRPPN
(SEQ ID NO:771); and/or YAGPLi/lLGLLLAVIGGLV^RRVTWNFSLIKLDGLLQL
CVLCLL (SEQ ID NO:772). Polynucleotides encoding these polypeptides are also
encompassed by the invention.

This gene is expressed primarily in infant adrenal gland and prostate cell line and
to a lesser extent in a few other tissues such as liver and smooth muscle.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, prostate cancer and endocrine disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the prostate and adrenal gland, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 423 as residues: Pro-34 to Gly-43, Arg-113 to Pro-120.

The tissue distribution and homology to N33 indicate that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and treatment of
prostate cancer and endocrine disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 191 

This gene is expressed primarily in T cells and to a lesser extent in fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 424 as residues:
Trp-3 to Phe-9.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of immune disorders. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 192 

This gene maps to chromosome 6, and therefore polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 6. Neural activity and
neurotrophins induce synaptic remodeling in part by altering gene expression. The
product of this gene is believed to be a glycosylphosphatidylinositol-anchored protein
encoded by a hippocampal gene and to possess neural activity. This molecule is believed
to be expressed in postmitotic-differentiating neurons of the developing nervous system
and neuronal structures associated with plasticity in the adult. Message of this gene is
believed to be induced by neuronal activity and by the activity-regulated neurotrophins
BDNF and NT-3. The product of this gene is believed to stimulate neurite outgrowth
and arborization in primary embryonic hippocampal and cortical cultures and to act as a
downstream effector of activity-induced neurite outgrowth. In specific embodiments,
polypeptides of the invention comprise the sequence: DAVFKGFSDCLLKLGDS (SEQ
ID NO:773); CQEGAKDMWDKLRKESKNLN (SEQ ID NO:774);

VLLVSLS AALATWLSF (SEQ ID NO:775); MGLKLNGRYISLILA VQIAYLVQAVR
AAGKCDAVFKGFSDCLLKLGDS (SEQ ID NO:776); PAAWDDKTNIKTVCTYW
EDFHSCTVTALTDCQEGAKD^rWDKLRKESKNLNIQGSLFELCGSGNGAAGSL
LPAFPVLLVSLS AALATWLSF (SEQ ID NO:777); and/or MGLKLNGRYISLILA
VQL\YLVQAVR.^AGKCDAWKGFSDCIXKLGDSXXXXXPAAWDDKTNIKTVC
TYWEDFHSCTVTALTDCQEGAKDMWDK^
GSLLPAFP VLLVSLS AALATWLSF (SEQ ID NO:778). Polynucleotides encoding
these polypeptides are also encompassed by the invention.

This gene is expressed primarily in human placenta, endometrial tumor, and
tissues of the central nervous system (CNS).
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, reproductive disorders, cancers and neurological diseases.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
reproductive and nervous systems, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 425 as residues: Asp-47 to Asp-63,
His-75 to Tyr-80, Pro-83 to Tyr-89.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of reproductive
disorders such as endometrial tumors. Expression of this gene in tissues of the CNS
and its strong homology to Neuritin suggest that the protein product from this gene may
also be used in the treatment and diagnosis of neurological disorders and in the
regeneration of neural tissues, e.g., following injury.

FEATURES OF PROTEIN ENCODED BY GENE NO: 193

The translation product of this gene shares sequence homology with tenascin,
which is thought to be important in development. The translation product of this gene is
believed to be a ligand of the fibroblast growth factor family. FGF ligand activity is
known in the art and can be assayed by methods known in the art and disclosed
elsewhere herein.

This gene is expressed primarily in endometrial tumors, and other types of 
tumors. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the cancer tissues, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 426 as residues: Gly-29 to Glu-34,
Arg-71 to Arg-76, Thr-176 to Cys-182, Gly-184 to Glu-199.

The tissue distribution and homology to tenascin indicate that polynucleotides
and polypeptides corresponding to this gene are useful for diagnosis and treatment of
cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 194 

In specific embodiments, polypeptides of the invention comprise the sequence:
MNSAAGFSHLDRRERVLKLGESFEKQPRCASTLC (SEQ ID NO:779).
Polynucleotides encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in fetal human lung and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, lung development and respiratory disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the respiratory system, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution in fetal lung and neutrophils indicates that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of lung and immunity-related diseases, for example, lung cancer; viral,
fungal or bacterial infections (e.g., lesions caused by tuberculosis); inflammation (e.g.,
pneumonia); and metabolic lesions.

FEATURES OF PROTEIN ENCODED BY GENE NO: 195 

This gene is expressed primarily in breast lymph node. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of immune
disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 196

This gene maps to chromosome 5, and accordingly polynucleotides of the invention can
be used in linkage analysis as a marker for chromosome 5. The translation product of
this gene shares sequence homology with human M-phase phosphoprotein 4, which is
thought to be important in phosphorylation and signal transduction processes. In
specific embodiments, polypeptides of the invention comprise the sequence:
TIYPTEEELQAVQKIVSITERALKLVSD (SEQ ID NO:780); RALKGVLRV
GVLAKGLLLRGDRNVNLVLLC (SEQ ID NO:781); ALAALRHAKWFQARAN
GLQSCVIHRILRDLCQRVPTWS (SEQ ID NO:782); GDALRRVFECISSGIIL (SEQ
ID NO:783); LAFRQIHKVLGMDPLP (SEQ ID NO:784); and/or TIYPTEEELQAVQ
KJVSITERALKLVSDSLSEHEKNK^KEGDDKKEGGKDRALKGVLRVGVLAKG
IXLRGDRNVNLVLLCSEKP^
NSCV^PKiMQVTn-LTSPnREENMREGDVTSGMVKDPPDVLDRQKCLDAJLAALR
HAKWFQARANGLQSCVinRILRDLCQRVPTWSDFPSW
QSPGDALRRVFECISSGIILKGSPGLLDPCEKDPFDTLATMTDQQREDITSSAQFA
LRLLAFRQIHKVLGMDPLPQMSQRFNIHN'NRKRRRDSDGVDGFEAEGKKDKK
DYDNF (SEQ ID NO:785). Polynucleotides encoding these polypeptides are also
encompassed by the invention.

This gene is expressed primarily in human hippocampus and to a lesser extent
in prostate and human frontal cortex.
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders related to the reproductive system and nervous system.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
reproductive system and nervous system, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution and homology to human M-phase phosphoprotein 4
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
the diagnosis and treatment of reproductive and nervous system disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 197

In specific embodiments, polypeptides of the invention comprise the sequence:
MGSQHSAAARPSSCRRKQEDDRDG (SEQ ID NO:786);
LiAEREQEEAL\QFPYVEFTGRDSITCLTC (SEQ ID NO:787); and/or
QGTGYIPTEQVNELVALIPHSDQRLRPQRTKQYV (SEQ ID NO:788).
Polynucleotides encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in human primary breast cancer and to a
lesser extent in human adult spleen, Hodgkin's lymphoma I, and salivary gland.
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer and immune disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the cancer and immune system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 430 as residues: Ser-126 to Gly-138.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of cancer and 
immune disorders.



FEATURES OF PROTEIN ENCODED BY GENE NO: 198
This gene is expressed primarily in monocytes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, blood cell disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of blood cell
disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 199 
This gene is expressed primarily in Human Ovary and Synovia and to a lesser
extent in Human 8 Week Whole Embryo.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, reproductive and developmental disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the reproductive and developmental system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of reproductive
and developmental disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 200

This gene maps to chromosome 8 and therefore polynucleotides of the invention 
can be used in linkage analysis as a marker for chromosome 8. The translation product 
of this gene shares limited sequence homology with collagen proline rich domain. 

This gene is expressed primarily in CNS. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological diseases. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the nervous system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 433 as residues: 
Pro-35 to Asp-41. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neurological 
diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 201

The translation product of this gene shares homology with a mammalian histone
H1a protein. One embodiment for this gene is the polypeptide fragments comprising the
following amino acid sequences:
ARLNVGRESLKREMLKSQGVKVSESPMGARHSSWPEGAFCKKVQGAQMQFPPRR (SEQ ID NO:789);
ARLNVGRESLKREML (SEQ ID NO:790); LKSQGVKVSESPMGARHSSW (SEQ ID NO:791);
AFCKKVQGAQMQFPPRR (SEQ ID NO:792). An additional embodiment is the
polynucleotide fragments encoding these polypeptide fragments (see Accession No.
pir|S24178).

This gene is expressed primarily in neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of immune disorders.
Since the gene is expressed in cells of lymphoid origin, the natural gene product may be
involved in vital immune functions. Therefore it may also be used as an agent for
immunological disorders including arthritis, asthma, immune deficiency diseases such
as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 202 
This gene is expressed primarily in neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of immune disorders.
Since the gene is expressed in cells of lymphoid origin, the natural gene product may be
involved in immune functions. Therefore it may also be used as an agent for
immunological disorders including arthritis, asthma, immune deficiency diseases such
as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 203
This gene is expressed primarily in Neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, infectious disorders, immune disorders, and cancers. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 436 as residues: Thr-31 to Lys-36.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of infectious
disorders, immune disorders, and cancers. Since the gene is expressed in cells of
lymphoid origin, the natural gene product may be involved in immune functions.
Therefore it may also be used as an agent for immunological disorders including
arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia. The protein,
as well as antibodies directed against the protein, may show utility as a tumor marker
and/or immunotherapy target for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 204 

This gene maps to chromosome 16 and therefore polynucleotides of the
invention can be used in linkage analysis as markers for chromosome 16. The
translation product of this gene shares sequence homology with lactate dehydrogenase,
which is thought to be important in lactate metabolism.

This gene is expressed primarily in human tonsils and to a lesser extent in
spleen and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders, infectious disorders, and cancers. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly immune
disorders, infectious disorders, and cancers, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 437 as residues: Gly-7 to Ser-12.

The tissue distribution and homology to the lactate dehydrogenase gene indicate
that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and treatment of immune disorders, infectious disorders, and cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 205

The translation product of this gene shares sequence homology with Gcap1
protein which is developmentally regulated in brain. 

This gene is expressed primarily in placenta and endometrial tumor and to a 
lesser extent in several other tumors. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, vasculogenesis/angiogenesis and tumorigenesis. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the vascular system and tumors,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to Gcap1 protein indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of disorders or dysfunction of the vascular system or of tumorigenesis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 206

In specific embodiments, polypeptides of the invention comprise the sequence:
MPYAQWLAENDRPEEAQKAFHKAGRQREA (SEQ ID NO:799);
VQVXEQLTNNAV^SRF^A.^YY^'WNLSMQCLDUQD (SEQ ID NO:794);
PAQKDTMLGKFYHFQRLAELYHGYHAIHRHTEDP (SEQ ID NO:795);
FSVHRPETLFNISRFLLHSLPKDTPSGISKVKILFT (SEQ ID NO:800);
LAKQSKALGAYRLARHAYDKERGLYIP (SEQ ID NO:796);
ARFQKSIELGTLTIR^KPFHDSEELVPLCYRCSTNN (SEQ ID NO:797); and/or
PULNNLGNVCINCRQPFrFSASSYDVTHLVEF\TJEEGITDEE.AlSLIDLEVLRPKRDDRQLEICKQQLPDSCG
(SEQ ID NO:798). Polynucleotides encoding these polypeptides are also
encompassed by the invention.

This gene is expressed primarily in testes.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive and endocrine disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the reproductive and endocrine systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment of male reproductive and endocrine
disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 207
This gene is expressed in fetal lung. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, lung diseases such as cystic fibrosis. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the respiratory system, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 440 as residues: Tyr-49 to Cys-54.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for detection and treatment of disorders associated
with developing lungs, particularly in premature infants where the lungs are the last
tissues to develop. The tissue distribution also indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and intervention of
lung tumors, since the gene may be involved in the regulation of cell division,
particularly since it is expressed in fetal tissue. The protein, as well as antibodies directed
against the protein, may show utility as a tumor marker and immunotherapy target for
the above listed tumors and tissues.

Table 1 summarizes the information corresponding to each "Gene No." described
above. The nucleotide sequence identified as "NT SEQ ID NO:X" was assembled from
partially homologous ("overlapping") sequences obtained from the "cDNA clone ID"
identified in Table 1 and, in some cases, from additional related DNA clones. The
overlapping sequences were assembled into a single contiguous sequence of high
redundancy (usually three to five overlapping sequences at each nucleotide position),
resulting in a final sequence identified as SEQ ID NO:X.

The cDNA Clone ID was deposited on the date and given the corresponding
deposit number listed in "ATCC Deposit No:Z and Date." Some of the deposits contain
multiple different clones corresponding to the same gene. "Vector" refers to the type of
vector contained in the cDNA Clone ID.

"Total NT Seq." refers to the total number of nucleotides in the contig identified
by "Gene No." The deposited clone may contain all or most of these sequences,
reflected by the nucleotide position indicated as "5' NT of Clone Seq." and the "3' NT
of Clone Seq." of SEQ ID NO:X. The nucleotide position of SEQ ID NO:X of the
putative start codon (methionine) is identified as "5' NT of Start Codon." Similarly,
the nucleotide position of SEQ ID NO:X of the predicted signal sequence is identified as
"5' NT of First AA of Signal Pep."

The translated amino acid sequence, beginning with the methionine, is identified
as "AA SEQ ID NO:Y," although other reading frames can also be easily translated
using known molecular biology techniques. The polypeptides produced by these
alternative open reading frames are specifically contemplated by the present invention.

The first and last amino acid position of SEQ ID NO:Y of the predicted signal
peptide is identified as "First AA of Sig Pep" and "Last AA of Sig Pep." The predicted
first amino acid position of SEQ ID NO:Y of the secreted portion is identified as
"Predicted First AA of Secreted Portion." Finally, the amino acid position of SEQ ID
NO:Y of the last amino acid in the open reading frame is identified as "Last AA of
ORF."
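
The amino acid columns of Table 1 can be read as 1-based coordinates into SEQ ID NO:Y. The following is a minimal sketch of that reading, not part of the disclosure: the class name Table1Row, its field names, and the 20-residue sequence are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Table1Row:
    """Hypothetical container for the amino acid columns of one Table 1 row."""

    aa_seq: str             # translated sequence ("AA SEQ ID NO:Y")
    first_aa_sig_pep: int   # "First AA of Sig Pep" (1-based)
    last_aa_sig_pep: int    # "Last AA of Sig Pep" (1-based)
    first_aa_secreted: int  # "Predicted First AA of Secreted Portion" (1-based)
    last_aa_orf: int        # "Last AA of ORF" (1-based)

    def signal_peptide(self) -> str:
        # Convert the 1-based inclusive range to a Python slice.
        return self.aa_seq[self.first_aa_sig_pep - 1 : self.last_aa_sig_pep]

    def secreted_portion(self) -> str:
        return self.aa_seq[self.first_aa_secreted - 1 : self.last_aa_orf]


# Made-up sequence: a 14-residue signal peptide followed by 6 mature residues.
row = Table1Row("MKLLVVAALLAGSQAEDTPK", 1, 14, 15, 20)
print(row.signal_peptide(), row.secreted_portion())
```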

SEQ ID NO:X and the translated SEQ ID NO:Y are sufficiently accurate and
otherwise suitable for a variety of uses well known in the art and described further
below. For instance, SEQ ID NO:X is useful for designing nucleic acid hybridization
probes that will detect nucleic acid sequences contained in SEQ ID NO:X or the cDNA
contained in the deposited clone. These probes will also hybridize to nucleic acid
molecules in biological samples, thereby enabling a variety of forensic and diagnostic
methods of the invention. Similarly, polypeptides identified from SEQ ID NO:Y may
be used to generate antibodies which bind specifically to the secreted proteins encoded
by the cDNA clones identified in Table 1.

Nevertheless, DNA sequences generated by sequencing reactions can contain
sequencing errors. The errors exist as misidentified nucleotides, or as insertions or
deletions of nucleotides in the generated DNA sequence. The erroneously inserted or
deleted nucleotides cause frame shifts in the reading frames of the predicted amino acid
sequence. In these cases, the predicted amino acid sequence diverges from the actual
amino acid sequence, even though the generated DNA sequence may be greater than
99.9% identical to the actual DNA sequence (for example, one base insertion or deletion
in an open reading frame of over 1000 bases).
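
The effect of a single inserted base can be illustrated with a short sketch. It assumes the standard genetic code and uses a made-up 1002-base reading frame; the sequences and variable names are hypothetical and are not taken from any clone of Table 1.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate frame 1 of a DNA string; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i : i + 3]] for i in range(0, len(dna) - 2, 3))


# Made-up reading frame: ATG followed by 111 repeats of GCT GAA CGT (1002 bases).
actual = "ATG" + "GCTGAACGT" * 111
# Simulated sequencing read with one erroneously inserted base after base 300.
generated = actual[:300] + "A" + actual[300:]

# One inserted base out of 1003 aligned positions.
dna_identity = 100.0 * (len(generated) - 1) / len(generated)

actual_protein = translate(actual)
generated_protein = translate(generated)
# Index of the first residue at which the two translations disagree.
matching_prefix = next(
    i for i, (a, b) in enumerate(zip(actual_protein, generated_protein)) if a != b
)

print(f"DNA identity: {dna_identity:.2f}%")
print(f"Translations agree for the first {matching_prefix} residues only")
```

In this sketch the DNA sequences remain more than 99.9% identical, yet the predicted protein matches the actual protein only up to the insertion point and then runs into a premature stop codon in the shifted frame.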

Accordingly, for those applications requiring precision in the nucleotide
sequence or the amino acid sequence, the present invention provides not only the
generated nucleotide sequence identified as SEQ ID NO:X and the predicted translated
amino acid sequence identified as SEQ ID NO:Y, but also a sample of plasmid DNA
containing a human cDNA of the invention deposited with the ATCC, as set forth in
Table 1. The nucleotide sequence of each deposited clone can readily be determined by
sequencing the deposited clone in accordance with known methods. The predicted
amino acid sequence can then be verified from such deposits. Moreover, the amino
acid sequence of the protein encoded by a particular clone can also be directly
determined by peptide sequencing or by expressing the protein in a suitable host cell
containing the deposited human cDNA, collecting the protein, and determining its
sequence.

The present invention also relates to the genes corresponding to SEQ ID NO:X,
SEQ ID NO:Y, or the deposited clone. The corresponding gene can be isolated in
accordance with known methods using the sequence information disclosed herein.
Such methods include preparing probes or primers from the disclosed sequence and
identifying or amplifying the corresponding gene from appropriate sources of genomic
material.

Also provided in the present invention are species homologs. Species
homologs may be isolated and identified by making suitable probes or primers from the
sequences provided herein and screening a suitable nucleic acid source for the desired
homolog.

The polypeptides of the invention can be prepared in any suitable manner. Such
polypeptides include isolated naturally occurring polypeptides, recombinantly produced
polypeptides, synthetically produced polypeptides, or polypeptides produced by a
combination of these methods. Means for preparing such polypeptides are well
understood in the art.

The polypeptides may be in the form of the secreted protein, including the
mature form, or may be a part of a larger protein, such as a fusion protein (see below).
It is often advantageous to include an additional amino acid sequence which contains
secretory or leader sequences, pro-sequences, sequences which aid in purification,
such as multiple histidine residues, or an additional sequence for stability during
recombinant production.

The polypeptides of the present invention are preferably provided in an isolated
form, and preferably are substantially purified. A recombinantly produced version of a
polypeptide, including the secreted polypeptide, can be substantially purified by the
one-step method described in Smith and Johnson, Gene 67:31-40 (1988).
Polypeptides of the invention also can be purified from natural or recombinant sources
using antibodies of the invention raised against the secreted protein in methods which
are well known in the art.

Signal Sequences 

Methods for predicting whether a protein has a signal sequence, as well as the
cleavage point for that sequence, are available. For instance, the method of McGeoch,
Virus Res. 3:271-286 (1985), uses the information from a short N-terminal charged
region and a subsequent uncharged region of the complete (uncleaved) protein. The
method of von Heijne, Nucleic Acids Res. 14:4683-4690 (1986), uses the information
from the residues surrounding the cleavage site, typically residues -13 to +2, where +1
indicates the amino terminus of the secreted protein. The accuracy of predicting the
cleavage points of known mammalian secretory proteins for each of these methods is in
the range of 75-80% (von Heijne, supra). However, the two methods do not always
produce the same predicted cleavage point(s) for a given protein.
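
The residue numbering used around the cleavage site (-13 to +2, with +1 as the first residue of the secreted protein and no position 0) can be made concrete with a small sketch. This only extracts the window of residues; it does not score it and is not an implementation of either published method. The function name and the example sequence are hypothetical.

```python
def cleavage_window(protein: str, first_mature_pos: int,
                    upstream: int = 13, downstream: int = 2) -> str:
    """Return residues -upstream..+downstream around a cleavage site.

    first_mature_pos is the 1-based position of residue +1, i.e. the amino
    terminus of the secreted protein. There is no position 0, so the window
    holds upstream + downstream residues.
    """
    start = first_mature_pos - 1 - upstream  # 0-based index of residue -upstream
    end = first_mature_pos - 1 + downstream  # exclusive end, just past +downstream
    if start < 0 or end > len(protein):
        raise ValueError("window extends past the ends of the sequence")
    return protein[start:end]


# Made-up precursor whose secreted portion begins at residue 15.
window = cleavage_window("MKLLVVAALLAGSQAEDTPK", 15)
print(window, len(window))
```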

In the present case, the deduced amino acid sequence of the secreted polypeptide
was analyzed by a computer program called SignalP (Henrik Nielsen et al., Protein
Engineering 10:1-6 (1997)), which predicts the cellular location of a protein based on
the amino acid sequence. As part of this computational prediction of localization, the
methods of McGeoch and von Heijne are incorporated. The analysis of the amino acid
sequences of the secreted proteins described herein by this program provided the results
shown in Table 1.

As one of ordinary skill would appreciate, however, cleavage sites sometimes
vary from organism to organism and cannot be predicted with absolute certainty.
Accordingly, the present invention provides secreted polypeptides having a sequence
shown in SEQ ID NO:Y which have an N-terminus beginning within 5 residues (i.e., +
or - 5 residues) of the predicted cleavage point. Similarly, it is also recognized that in
some cases, cleavage of the signal sequence from a secreted protein is not entirely
uniform, resulting in more than one secreted species. These polypeptides, and the
polynucleotides encoding such polypeptides, are contemplated by the present invention.
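
The set of secreted polypeptides whose N-terminus begins within 5 residues of the predicted cleavage point can be enumerated as in the sketch below. The function name and the example sequence are hypothetical, and the predicted cleavage point is taken to be the 1-based position of the predicted first residue of the secreted portion.

```python
def n_terminal_variants(protein: str, predicted_first_mature: int,
                        tolerance: int = 5) -> dict:
    """Map each offset in -tolerance..+tolerance to the secreted sequence
    that starts at predicted_first_mature + offset (1-based positions)."""
    variants = {}
    for offset in range(-tolerance, tolerance + 1):
        start = predicted_first_mature + offset
        # Skip start positions that fall outside the sequence.
        if 1 <= start <= len(protein):
            variants[offset] = protein[start - 1:]
    return variants


# Made-up precursor with a predicted secreted portion starting at residue 15.
variants = n_terminal_variants("MKLLVVAALLAGSQAEDTPK", 15)
for offset, seq in variants.items():
    print(f"{offset:+d}: {seq}")
```

Offset 0 corresponds to the predicted secreted form; negative offsets retain residues of the predicted signal sequence and positive offsets remove residues from the predicted mature N-terminus.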

Moreover, the signal sequence identified by the above analysis may not
necessarily predict the naturally occurring signal sequence. For example, the naturally
occurring signal sequence may be further upstream from the predicted signal sequence.
However, it is likely that the predicted signal sequence will be capable of directing the
secreted protein to the ER. These polypeptides, and the polynucleotides encoding such
polypeptides, are contemplated by the present invention.

Polynucleotide and Polypeptide Variants

"Variant" refers to a polynucleotide or polypeptide differing from the 
polynucleotide or polypeptide of the present invention, but retaining essential properties 
thereof. Generally, variants are overall closely similar, and, in many regions, identical 
to the polynucleotide or polypeptide of the present invention.

By a polynucleotide having a nucleotide sequence at least, for example, 95%
"identical" to a reference nucleotide sequence of the present invention, it is intended that
the nucleotide sequence of the polynucleotide is identical to the reference sequence
except that the polynucleotide sequence may include up to five point mutations per each
100 nucleotides of the reference nucleotide sequence encoding the polypeptide. In other
words, to obtain a polynucleotide having a nucleotide sequence at least 95% identical to
a reference nucleotide sequence, up to 5% of the nucleotides in the reference sequence
may be deleted or substituted with another nucleotide, or a number of nucleotides up to
5% of the total nucleotides in the reference sequence may be inserted into the reference
sequence. The query sequence may be an entire sequence shown in Table 1, the ORF
(open reading frame), or any fragment specified as described herein.
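
The arithmetic behind "at least 95% identical" can be sketched as follows. The function name is hypothetical, and rounding down to a whole number of nucleotides is an assumption of this sketch rather than something the text specifies.

```python
def max_alterations(reference_length: int, percent_identity: int) -> int:
    """Largest whole number of altered positions that still leaves a sequence
    at least percent_identity percent identical to the reference."""
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return reference_length * (100 - percent_identity) // 100


print(max_alterations(100, 95))   # five point mutations per 100 nucleotides
print(max_alterations(1000, 95))
```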

As a practical matter, whether any particular nucleic acid molecule or
polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to a nucleotide
sequence of the present invention can be determined conventionally using known
computer programs. A preferred method for determining the best overall match between
a query sequence (a sequence of the present invention) and a subject sequence, also
referred to as a global sequence alignment, can be determined using the FASTDB
computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990)
6:237-245). In a sequence alignment the query and subject sequences are both DNA
sequences. An RNA sequence can be compared by converting U's to T's. The result
of said global sequence alignment is in percent identity. Preferred parameters used in a
FASTDB alignment of DNA sequences to calculate percent identity are:
Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30, Randomization
Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, Window
Size=500 or the length of the subject nucleotide sequence, whichever is shorter.
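
For bookkeeping, the preferred DNA parameters can be collected in one place, with the window size computed from the subject length. This is only a sketch: the function and dictionary keys are hypothetical and do not correspond to any actual FASTDB interface, and Mismatch Penalty and Cutoff Score are both taken as 1.

```python
def fastdb_dna_parameters(subject_length: int) -> dict:
    """Preferred DNA alignment parameters as listed in the text.

    The keys mirror the parameter names in the text; they are not options
    of a real program interface.
    """
    return {
        "Matrix": "Unitary",
        "k-tuple": 4,
        "Mismatch Penalty": 1,   # assumed value, see lead-in
        "Joining Penalty": 30,
        "Randomization Group Length": 0,
        "Cutoff Score": 1,       # assumed value, see lead-in
        "Gap Penalty": 5,
        "Gap Size Penalty": 0.05,
        # 500 or the subject length, whichever is shorter.
        "Window Size": min(500, subject_length),
    }


print(fastdb_dna_parameters(320)["Window Size"])
print(fastdb_dna_parameters(2400)["Window Size"])
```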

If the subject sequence is shorter than the query sequence because of 5' or 3'
deletions, not because of internal deletions, a manual correction must be made to the
results. This is because the FASTDB program does not account for 5' and 3'
truncations of the subject sequence when calculating percent identity. For subject
sequences truncated at the 5' or 3' ends, relative to the query sequence, the percent
identity is corrected by calculating the number of bases of the query sequence that are 5'
and 3' of the subject sequence, which are not matched/aligned, as a percent of the total
bases of the query sequence. Whether a nucleotide is matched/aligned is determined by
results of the FASTDB sequence alignment. This percentage is then subtracted from
the percent identity, calculated by the above FASTDB program using the specified
parameters, to arrive at a final percent identity score. This corrected score is what is
used for the purposes of the present invention. Only bases outside the 5' and 3' bases
of the subject sequence, as displayed by the FASTDB alignment, which are not
matched/aligned with the query sequence, are calculated for the purposes of manually
adjusting the percent identity score.

For example, a 90 base subject sequence is aligned to a 100 base query
sequence to determine percent identity. The deletions occur at the 5' end of the subject
sequence and therefore, the FASTDB alignment does not show a match/alignment of
the first 10 bases at the 5' end. The 10 unpaired bases represent 10% of the sequence
(number of bases at the 5' and 3' ends not matched/total number of bases in the query
sequence) so 10% is subtracted from the percent identity score calculated by the
FASTDB program. If the remaining 90 bases were perfectly matched the final percent
identity would be 90%. In another example, a 90 base subject sequence is compared
with a 100 base query sequence. This time the deletions are internal deletions so that
there are no bases on the 5' or 3' of the subject sequence which are not matched/aligned
with the query. In this case the percent identity calculated by FASTDB is not manually
corrected. Once again, only bases 5' and 3' of the subject sequence which are not
matched/aligned with the query sequence are manually corrected for. No other manual
corrections are to be made for the purposes of the present invention.
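
The manual correction for 5' and 3' truncations reduces to simple arithmetic. The following Python sketch is illustrative only: the function name and its parameters are hypothetical and are not part of the FASTDB program, and it assumes that the FASTDB percent identity and the counts of unmatched terminal query bases have already been read from the alignment.

```python
def corrected_percent_identity(fastdb_identity, query_length,
                               unmatched_5prime=0, unmatched_3prime=0):
    """Subtract the terminal-truncation penalty from a FASTDB percent identity.

    Only query bases lying 5' or 3' of the subject sequence count toward
    the penalty; internal deletions are left to the alignment program.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    overhang = unmatched_5prime + unmatched_3prime
    if unmatched_5prime < 0 or unmatched_3prime < 0 or overhang > query_length:
        raise ValueError("unmatched base counts are out of range")
    # Unmatched terminal bases expressed as a percent of the total query bases.
    penalty = 100.0 * overhang / query_length
    return fastdb_identity - penalty


# First worked example: 100-base query, 10 query bases unmatched at the 5' end,
# remaining 90 bases perfectly matched.
print(corrected_percent_identity(100.0, 100, unmatched_5prime=10))  # 90.0

# Second worked example: internal deletions only, so nothing is subtracted.
# The 90.0 passed in is a placeholder for whatever FASTDB itself reports.
print(corrected_percent_identity(90.0, 100))
```

Passing the two terminal counts separately keeps the 5' and 3' contributions visible, although only their sum affects the result.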

By a polypeptide having an amino acid sequence at least, for example, 95%
"identical" to a query amino acid sequence of the present invention, it is intended that
the amino acid sequence of the subject polypeptide is identical to the query sequence
except that the subject polypeptide sequence may include up to five amino acid
alterations per each 100 amino acids of the query amino acid sequence. In other words,
to obtain a polypeptide having an amino acid sequence at least 95% identical to a query
amino acid sequence, up to 5% of the amino acid residues in the subject sequence may
be inserted, deleted, (indels) or substituted with another amino acid. These alterations
of the reference sequence may occur at the amino or carboxy terminal positions of the
reference amino acid sequence or anywhere between those terminal positions,
interspersed either individually among residues in the reference sequence or in one or
more contiguous groups within the reference sequence.
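
The relationship between an identity threshold and the permitted number of alterations can be sketched as follows. This is a hedged illustration, not a definition from the specification: the function name is hypothetical, each insertion, deletion, or substitution is assumed to count as one alteration, the threshold is assumed to be a whole-number percentage, and rounding down for query lengths that are not multiples of 100 is an assumption.

```python
def max_alterations(query_length, percent_identity):
    """Largest number of altered residues that still meets the threshold."""
    if query_length < 0 or not 0 <= percent_identity <= 100:
        raise ValueError("length must be >= 0 and identity within 0..100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return query_length * (100 - percent_identity) // 100


# A 100-residue query at the thresholds named in the text.
for threshold in (90, 95, 96, 97, 98, 99):
    print(threshold, max_alterations(100, threshold))
```

For a 100-residue query at 95% identity the function returns 5, matching the "up to five amino acid alterations per each 100 amino acids" stated above.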

As a practical matter, whether any particular polypeptide is at least 90%, 95%,
96%, 97%, 98% or 99% identical to, for instance, the amino acid sequences shown in
Table 1 or to the amino acid sequence encoded by deposited DNA clone can be
determined conventionally using known computer programs. A preferred method for
determining the best overall match between a query sequence (a sequence of the present
invention) and a subject sequence, also referred to as a global sequence alignment, can
be determined using the FASTDB computer program based on the algorithm of Brutlag
et al. (Comp. App. Biosci. (1990) 6:237-245). In a sequence alignment the query and
subject sequences are either both nucleotide sequences or both amino acid sequences.
The result of said global sequence alignment is in percent identity. Preferred parameters
used in a FASTDB amino acid alignment are: Matrix=PAM 0, k-tuple=2, Mismatch
Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1,
Window Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, Window
Size=500 or the length of the subject amino acid sequence, whichever is shorter.

If the subject sequence is shorter than the query sequence due to N- or C-
terminal deletions, not because of internal deletions, a manual correction must be made
to the results. This is because the FASTDB program does not account for N- and C-
terminal truncations of the subject sequence when calculating global percent identity.
For subject sequences truncated at the N- and C-termini, relative to the query
sequence, the percent identity is corrected by calculating the number of residues of the
query sequence that are N- and C-terminal of the subject sequence, which are not
matched/aligned with a corresponding subject residue, as a percent of the total residues of
the query sequence. Whether a residue is matched/aligned is determined by results of
the FASTDB sequence alignment. This percentage is then subtracted from the percent
identity, calculated by the above FASTDB program using the specified parameters, to
arrive at a final percent identity score. This final percent identity score is what is used
for the purposes of the present invention. Only residues to the N- and C-termini of the
subject sequence, which are not matched/aligned with the query sequence, are
considered for the purposes of manually adjusting the percent identity score. That is,
only query residue positions outside the farthest N- and C-terminal residues of the
subject sequence.

For example, a 90 amino acid residue subject sequence is aligned with a 100
residue query sequence to determine percent identity. The deletion occurs at the N-
terminus of the subject sequence and therefore, the FASTDB alignment does not show
a matching/alignment of the first 10 residues at the N-terminus. The 10 unpaired
residues represent 10% of the sequence (number of residues at the N- and C-termini
not matched/total number of residues in the query sequence) so 10% is subtracted from
the percent identity score calculated by the FASTDB program. If the remaining 90
residues were perfectly matched the final percent identity would be 90%. In another
example, a 90 residue subject sequence is compared with a 100 residue query sequence.
This time the deletions are internal deletions so there are no residues at the N- or C-
termini of the subject sequence which are not matched/aligned with the query. In this
case the percent identity calculated by FASTDB is not manually corrected. Once again,
only residue positions outside the N- and C-terminal ends of the subject sequence, as
displayed in the FASTDB alignment, which are not matched/aligned with the query
sequence are manually corrected for. No other manual corrections are to be made for the
purposes of the present invention.
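
The distinction between terminal truncations (which are corrected for) and internal deletions (which are not) can be sketched in Python. The sketch assumes, purely for illustration, that an alignment is available as two equal-length gapped strings with "-" marking gaps; this representation, the function names, and the ten-residue sequences are hypothetical and do not reflect the actual FASTDB output format.

```python
from typing import Tuple


def terminal_overhangs(aligned_query: str, aligned_subject: str,
                       gap: str = "-") -> Tuple[int, int]:
    """Count query residues lying outside the N- and C-terminal ends of the subject."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned strings must have equal length")
    n = len(aligned_subject)
    lead = 0
    while lead < n and aligned_subject[lead] == gap:
        lead += 1
    trail = 0
    while trail < n - lead and aligned_subject[n - 1 - trail] == gap:
        trail += 1
    # Only query residues (not gaps) in the overhanging columns are counted.
    n_term = sum(1 for c in aligned_query[:lead] if c != gap)
    c_term = sum(1 for c in aligned_query[n - trail:] if c != gap)
    return n_term, c_term


def truncation_penalty(aligned_query: str, aligned_subject: str,
                       gap: str = "-") -> float:
    """Unmatched terminal query residues as a percent of all query residues."""
    n_term, c_term = terminal_overhangs(aligned_query, aligned_subject, gap)
    query_length = sum(1 for c in aligned_query if c != gap)
    if query_length == 0:
        raise ValueError("query contains no residues")
    return 100.0 * (n_term + c_term) / query_length


# N-terminal truncation: three query residues have no subject partner.
print(terminal_overhangs("MKTAYIAKQR", "---AYIAKQR"))  # (3, 0)
# Internal deletion: no terminal overhang, so no manual correction applies.
print(terminal_overhangs("MKTAYIAKQR", "MKT--IAKQR"))  # (0, 0)
```

The value returned by `truncation_penalty` is the percentage that would be subtracted from the program-reported percent identity.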

The variants may contain alterations in the coding regions, non-coding regions,
or both. Especially preferred are polynucleotide variants containing alterations which
produce silent substitutions, additions, or deletions, but do not alter the properties or
activities of the encoded polypeptide. Nucleotide variants produced by silent
substitutions due to the degeneracy of the genetic code are preferred. Moreover,
variants in which 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any
combination are also preferred. Polynucleotide variants can be produced for a variety
of reasons, e.g., to optimize codon expression for a particular host (change codons in
the human mRNA to those preferred by a bacterial host such as E. coli).
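
Whether a nucleotide variant is "silent" in the sense used above can be checked by translating both sequences with the standard genetic code. The sketch below is a minimal illustration under stated assumptions: the function names and the three-codon example fragment are hypothetical, the sequences are assumed to be in frame, and the code only verifies that the encoded protein is unchanged; it does not choose host-preferred codons.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variant(reference, variant):
    """True when the variant differs in DNA but encodes the same protein."""
    return (reference.upper() != variant.upper()
            and translate(reference) == translate(variant))


# Hypothetical fragment encoding Met-Ala-Leu with two different codon choices.
print(is_silent_variant("ATGGCTCTA", "ATGGCGCTG"))  # True
# Changing GCT (Ala) to GAT (Asp) alters the protein, so it is not silent.
print(is_silent_variant("ATGGCTCTA", "ATGGATCTA"))  # False
```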

Naturally occurring variants are called "allelic variants," and refer to one of
several alternate forms of a gene occupying a given locus on a chromosome of an
organism. (Genes II, Lewin, B., ed., John Wiley & Sons, New York (1985).) These
allelic variants can vary at either the polynucleotide and/or polypeptide level.
Alternatively, non-naturally occurring variants may be produced by mutagenesis
techniques or by direct synthesis.

Using known methods of protein engineering and recombinant DNA
technology, variants may be generated to improve or alter the characteristics of the
polypeptides of the present invention. For instance, one or more amino acids can be
deleted from the N-terminus or C-terminus of the secreted protein without substantial
loss of biological function. The authors of Ron et al., J. Biol. Chem. 268:2984-2988
(1993), reported variant KGF proteins having heparin binding activity even after
deleting 3, 8, or 27 amino-terminal amino acid residues. Similarly, Interferon gamma
exhibited up to ten times higher activity after deleting 8-10 amino acid residues from the
carboxy terminus of this protein. (Dobeli et al., J. Biotechnology 7:199-216 (1988).)

Moreover, ample evidence demonstrates that variants often retain a biological
activity similar to that of the naturally occurring protein. For example, Gayle and
coworkers (J. Biol. Chem. 268:22105-22111 (1993)) conducted extensive mutational
analysis of human cytokine IL-1a. They used random mutagenesis to generate over
3,500 individual IL-1a mutants that averaged 2.5 amino acid changes per variant over
the entire length of the molecule. Multiple mutations were examined at every possible
amino acid position. The investigators found that "[m]ost of the molecule could be
altered with little effect on either [binding or biological activity]." (See, Abstract.) In
fact, only 23 unique amino acid sequences, out of more than 3,500 nucleotide
sequences examined, produced a protein that significantly differed in activity from
wild-type.

Furthermore, even if deleting one or more amino acids from the N-terminus or
C-terminus of a polypeptide results in modification or loss of one or more biological
functions, other biological activities may still be retained. For example, the ability of a
deletion variant to induce and/or to bind antibodies which recognize the secreted form
will likely be retained when less than the majority of the residues of the secreted form
are removed from the N-terminus or C-terminus. Whether a particular polypeptide
lacking N- or C-terminal residues of a protein retains such immunogenic activities can
readily be determined by routine methods described herein and otherwise known in the
art.

Thus, the invention further includes polypeptide variants which show
substantial biological activity. Such variants include deletions, insertions, inversions,
repeats, and substitutions selected according to general rules known in the art so as to
have little effect on activity. For example, guidance concerning how to make
phenotypically silent amino acid substitutions is provided in Bowie, J. U. et al.,
Science 247:1306-1310 (1990), wherein the authors indicate that there are two main
strategies for studying the tolerance of an amino acid sequence to change.

The first strategy exploits the tolerance of amino acid substitutions by natural
selection during the process of evolution. By comparing amino acid sequences in
different species, conserved amino acids can be identified. These conserved amino
acids are likely important for protein function. In contrast, the amino acid positions
where substitutions have been tolerated by natural selection indicate that these
positions are not critical for protein function. Thus, positions tolerating amino acid
substitution could be modified while still maintaining biological activity of the protein.

insert is generated according to the PCR protocol described in Example 1, using PCR
primers having restriction sites for NdeI (5' primer) and XbaI, BamHI, XhoI, or
Asp718 (3' primer). The PCR insert is gel purified and restricted with compatible
enzymes. The insert and vector are ligated according to standard protocols.

The engineered vector could easily be substituted in the above protocol to
express protein in a bacterial system.

Example 6: Purification of a Polypeptide from an Inclusion Body

The following alternative method can be used to purify a polypeptide expressed
in E. coli when it is present in the form of inclusion bodies. Unless otherwise specified,
all of the following steps are conducted at 4-10°C.

Upon completion of the production phase of the E. coli fermentation, the cell
culture is cooled to 4-10°C and the cells harvested by continuous centrifugation at
15,000 rpm (Heraeus Sepatech). On the basis of the expected yield of protein per unit
weight of cell paste and the amount of purified protein required, an appropriate amount
of cell paste, by weight, is suspended in a buffer solution containing 100 mM Tris, 50
mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension using a
high shear mixer.

The cells are then lysed by passing the solution through a microfluidizer
(Microfluidics, Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi. The homogenate is
then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by
centrifugation at 7000 x g for 15 min. The resultant pellet is washed again using 0.5 M
NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4.

The resulting washed inclusion bodies are solubilized with 1.5 M guanidine
hydrochloride (GuHCl) for 2-4 hours. After 7000 x g centrifugation for 15 min., the
pellet is discarded and the polypeptide-containing supernatant is incubated at 4°C
overnight to allow further GuHCl extraction.

Following high speed centrifugation (30,000 x g) to remove insoluble particles,
the GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20
volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by
vigorous stirring. The refolded diluted protein solution is kept at 4°C without mixing
for 12 hours prior to further purification steps.

To clarify the refolded polypeptide solution, a previously prepared tangential
filtration unit equipped with 0.16 µm membrane filter with appropriate surface area

(Isopropyl-β-D-thiogalactopyranoside) is then added to a final concentration of 1 mM.
IPTG induces by inactivating the lacI repressor, clearing the P/O, leading to increased
gene expression.

Cells are grown for an extra 3 to 4 hours. Cells are then harvested by
centrifugation (20 mins at 6000 x g). The cell pellet is solubilized in the chaotropic
agent 6 Molar Guanidine HCl by stirring for 3-4 hours at 4°C. The cell debris is
removed by centrifugation, and the supernatant containing the polypeptide is loaded
onto a nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin column (available from
QIAGEN, Inc., supra). Proteins with a 6 x His tag bind to the Ni-NTA resin with high
affinity and can be purified in a simple one-step procedure (for details see: The
QIAexpressionist (1995) QIAGEN, Inc., supra).

Briefly, the supernatant is loaded onto the column in 6 M guanidine-HCl, pH 8,
the column is first washed with 10 volumes of 6 M guanidine-HCl, pH 8, then washed
with 10 volumes of 6 M guanidine-HCl, pH 6, and finally the polypeptide is eluted with
6 M guanidine-HCl, pH 5.

The purified protein is then renatured by dialyzing it against phosphate-buffered
saline (PBS) or 50 mM Na-acetate, pH 6 buffer plus 200 mM NaCl. Alternatively, the
protein can be successfully refolded while immobilized on the Ni-NTA column. The
recommended conditions are as follows: renature using a linear 6M-1M urea gradient in
500 mM NaCl, 20% glycerol, 20 mM Tris/HCl pH 7.4, containing protease inhibitors.
The renaturation should be performed over a period of 1.5 hours or more. After
renaturation the proteins are eluted by the addition of 250 mM imidazole. Imidazole
is removed by a final dialyzing step against PBS or 50 mM sodium acetate pH 6 buffer
plus 200 mM NaCl. The purified protein is stored at 4°C or frozen at -80°C.

In addition to the above expression vector, the present invention further includes
an expression vector comprising phage operator and promoter elements operatively
linked to a polynucleotide of the present invention, called pHE4a. (ATCC Accession
Number 209645, deposited on February 25, 1998.) This vector contains: 1) a
neomycin phosphotransferase gene as a selection marker, 2) an E. coli origin of
replication, 3) a T5 phage promoter sequence, 4) two lac operator sequences, 5) a
Shine-Dalgarno sequence, and 6) the lactose operon repressor gene (lacIq). The origin
of replication (oriC) is derived from pUC19 (LTI, Gaithersburg, MD). The promoter
sequence and operator sequences are made synthetically.

DNA can be inserted into the pHE4a by restricting the vector with NdeI and
XbaI, BamHI, XhoI, or Asp718, running the restricted product on a gel, and isolating
the larger fragment (the stuffer fragment should be about 310 base pairs). The DNA

Example 4: Chromosomal Mapping of the Polynucleotides

An oligonucleotide primer set is designed according to the sequence at the 5'
end of SEQ ID NO:X. This primer preferably spans about 100 nucleotides. This
primer set is then used in a polymerase chain reaction under the following set of
conditions: 30 seconds, 95°C; 1 minute, 56°C; 1 minute, 70°C. This cycle is repeated
32 times followed by one 5 minute cycle at 70°C. Human, mouse, and hamster DNA
is used as template in addition to a somatic cell hybrid panel containing individual
chromosomes or chromosome fragments (Bios, Inc). The reactions are analyzed on
either 8% polyacrylamide gels or 3.5% agarose gels. Chromosome mapping is
determined by the presence of an approximately 100 bp PCR fragment in the particular
somatic cell hybrid.
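
The cycling conditions above can be written down as data, which makes the programmed run time easy to check. The following Python sketch is illustrative only: the `Step` class and the step labels ("denature", "anneal", "extend") are hypothetical conveniences, "repeated 32 times" is assumed to mean 32 cycles in total, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str      # descriptive label only; not taken from the protocol text
    temp_c: int     # hold temperature in degrees Celsius
    seconds: int    # hold time in seconds


CYCLE = (
    Step("denature", 95, 30),
    Step("anneal", 56, 60),
    Step("extend", 70, 60),
)
FINAL_EXTENSION = Step("final extension", 70, 300)
N_CYCLES = 32  # assumption: 32 cycles in total


def total_hold_seconds(cycle, n_cycles, final_step):
    """Sum of programmed hold times, excluding thermal ramping."""
    return n_cycles * sum(step.seconds for step in cycle) + final_step.seconds


total = total_hold_seconds(CYCLE, N_CYCLES, FINAL_EXTENSION)
print(total, "seconds of programmed hold time")  # 5100
```

Under these assumptions the programmed hold time is 5100 seconds, or 85 minutes.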

Example 5: Bacterial Expression of a Polypeptide 

A polynucleotide encoding a polypeptide of the present invention is amplified
using PCR oligonucleotide primers corresponding to the 5' and 3' ends of the DNA
sequence, as outlined in Example 1, to synthesize insertion fragments. The primers
used to amplify the cDNA insert should preferably contain restriction sites, such as
BamHI and XbaI, at the 5' end of the primers in order to clone the amplified product
into the expression vector. For example, BamHI and XbaI correspond to the restriction
enzyme sites on the bacterial expression vector pQE-9 (Qiagen, Inc., Chatsworth,
CA). This plasmid vector encodes antibiotic resistance (Amp^r), a bacterial origin of
replication (ori), an IPTG-regulatable promoter/operator (P/O), a ribosome binding site
(RBS), a 6-histidine tag (6-His), and restriction enzyme cloning sites.

The pQE-9 vector is digested with BamHI and XbaI and the amplified fragment
is ligated into the pQE-9 vector maintaining the reading frame initiated at the bacterial
RBS. The ligation mixture is then used to transform the E. coli strain M15/rep4
(Qiagen, Inc.) which contains multiple copies of the plasmid pREP4, which expresses
the lacI repressor and also confers kanamycin resistance (Kan^r). Transformants are
identified by their ability to grow on LB plates and ampicillin/kanamycin resistant
colonies are selected. Plasmid DNA is isolated and confirmed by restriction analysis.

Clones containing the desired constructs are grown overnight (O/N) in liquid
culture in LB media supplemented with both Amp (100 µg/ml) and Kan (25 µg/ml).
The O/N culture is used to inoculate a large culture at a ratio of 1:100 to 1:250. The
cells are grown to an optical density 600 (O.D.600) of between 0.4 and 0.6. IPTG

This above method starts with total RNA isolated from the desired source,
although poly-A+ RNA can be used. The RNA preparation can then be treated with
phosphatase if necessary to eliminate 5' phosphate groups on degraded or damaged
RNA which may interfere with the later RNA ligase step. The phosphatase should then
be inactivated and the RNA treated with tobacco acid pyrophosphatase in order to
remove the cap structure present at the 5' ends of messenger RNAs. This reaction
leaves a 5' phosphate group at the 5' end of the cap-cleaved RNA which can then be
ligated to an RNA oligonucleotide using T4 RNA ligase.

This modified RNA preparation is used as a template for first strand cDNA
synthesis using a gene specific oligonucleotide. The first strand synthesis reaction is
used as a template for PCR amplification of the desired 5' end using a primer specific to
the ligated RNA oligonucleotide and a primer specific to the known sequence of the
gene of interest. The resultant product is then sequenced and analyzed to confirm that
the 5' end sequence belongs to the desired gene.

Example 2: Isolation of Genomic Clones Corresponding to a
Polynucleotide

A human genomic P1 library (Genomic Systems, Inc.) is screened by PCR
using primers selected for the cDNA sequence corresponding to SEQ ID NO:X,
according to the method described in Example 1. (See also, Sambrook.)

Example 3: Tissue Distribution of Polypeptide 

Tissue distribution of mRNA expression of polynucleotides of the present
invention is determined using protocols for Northern blot analysis, described by,
among others, Sambrook et al. For example, a cDNA probe produced by the method
described in Example 1 is labeled with P32 using the rediprime™ DNA labeling system
(Amersham Life Science), according to manufacturer's instructions. After labeling, the
probe is purified using CHROMA SPIN-100™ column (Clontech Laboratories, Inc.),
according to manufacturer's protocol number PT1200-1. The purified labeled probe is
then used to examine various human tissues for mRNA expression.

Multiple Tissue Northern (MTN) blots containing various human tissues (H) or
human immune system tissues (IM) (Clontech) are examined with the labeled probe
using ExpressHyb™ hybridization solution (Clontech) according to manufacturer's
protocol number PT1190-1. Following hybridization and washing, the blots are
mounted and exposed to film at -70°C overnight, and the films developed according to
standard procedures.

The plasmid mixture is transformed into a suitable host, as indicated above (such as
XL-1 Blue (Stratagene)) using techniques known to those of skill in the art, such as
those provided by the vector supplier or in related publications or patents cited above.
The transformants are plated on 1.5% agar plates (containing the appropriate selection
agent, e.g., ampicillin) to a density of about 150 transformants (colonies) per plate.
These plates are screened using Nylon membranes according to routine methods for
bacterial colony screening (e.g., Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd Edit., (1989), Cold Spring Harbor Laboratory Press, pages 1.93 to
1.104), or other techniques known to those of skill in the art.
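
The plating density can be related to the chance of recovering a given clone. The sketch below is a rough estimate under stated assumptions rather than part of the protocol: it takes the figure of about 50 plasmids per deposit from the description of the deposited samples, assumes each plasmid is equally represented among transformants, and treats colonies as independent draws. The function names are hypothetical.

```python
import math


def prob_clone_present(colonies, clones_in_pool):
    """Probability that at least one screened colony carries the desired clone."""
    if colonies < 0 or clones_in_pool < 1:
        raise ValueError("colonies must be >= 0 and clones_in_pool >= 1")
    return 1.0 - (1.0 - 1.0 / clones_in_pool) ** colonies


def colonies_needed(target_probability, clones_in_pool):
    """Smallest number of colonies giving at least the target probability."""
    if not 0.0 < target_probability < 1.0 or clones_in_pool < 2:
        raise ValueError("need 0 < probability < 1 and at least two clones")
    miss = 1.0 - 1.0 / clones_in_pool
    return math.ceil(math.log(1.0 - target_probability) / math.log(miss))


# One plate of about 150 colonies from a pool of about 50 plasmids.
print(round(prob_clone_present(150, 50), 3))
# Colonies to screen for a 99% chance of recovering the clone.
print(colonies_needed(0.99, 50))
```

Under these assumptions a single plate of 150 colonies contains the desired clone with a probability of roughly 95%; deposits with more plasmids, or with unequal representation, would call for more colonies.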

Alternatively, two primers of 17-20 nucleotides derived from both ends of the
SEQ ID NO:X (i.e., within the region of SEQ ID NO:X bounded by the 5' NT and the
3' NT of the clone defined in Table 1) are synthesized and used to amplify the desired
cDNA using the deposited cDNA plasmid as a template. The polymerase chain reaction
is carried out under routine conditions, for instance, in 25 µl of reaction mixture with
0.5 µg of the above cDNA template. A convenient reaction mixture is 1.5-5 mM
MgCl2, 0.01% (w/v) gelatin, 20 µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of
each primer and 0.25 Unit of Taq polymerase. Thirty five cycles of PCR (denaturation
at 94°C for 1 min; annealing at 55°C for 1 min; elongation at 72°C for 1 min) are
performed with a Perkin-Elmer Cetus automated thermal cycler. The amplified product
is analyzed by agarose gel electrophoresis and the DNA band with expected molecular
weight is excised and purified. The PCR product is verified to be the selected sequence
by subcloning and sequencing the DNA product.

Several methods are available for the identification of the 5' or 3' non-coding
portions of a gene which may not be present in the deposited clone. These methods
include, but are not limited to, filter probing, clone enrichment using specific probes,
and protocols similar or identical to 5' and 3' "RACE" protocols which are well known
in the art. For instance, a method similar to 5' RACE is available for generating the
missing 5' end of a desired full-length transcript. (Fromont-Racine et al., Nucleic Acids
Res. 21(7):1683-1684 (1993).)

Briefly, a specific RNA oligonucleotide is ligated to the 5' ends of a population
of RNA presumably containing full-length gene RNA transcripts. A primer set
containing a primer specific to the ligated RNA oligonucleotide and a primer specific to
a known sequence of the gene of interest is used to PCR amplify the 5' portion of the
desired full-length gene. This amplified product may then be sequenced and used to
generate the full length gene.

Blue, also available from Stratagene. pBS comes in 4 forms: SK+, SK-, KS+ and KS-.
The S and K refer to the orientation of the polylinker to the T7 and T3 primer
sequences which flank the polylinker region ("S" is for SacI and "K" is for KpnI which
are the first sites on each respective end of the linker). "+" or "-" refer to the orientation
of the f1 origin of replication ("ori"), such that in one orientation, single stranded rescue
initiated from the f1 ori generates sense strand DNA and in the other, antisense.

Vectors pSport1, pCMVSport 2.0 and pCMVSport 3.0, were obtained from
Life Technologies, Inc., P. O. Box 6009, Gaithersburg, MD 20897. All Sport vectors
contain an ampicillin resistance gene and may be transformed into E. coli strain
DH10B, also available from Life Technologies. (See, for instance, Gruber, C. E., et
al., Focus 15:59 (1993).) Vector lafmid BA (Bento Soares, Columbia University, NY)
contains an ampicillin resistance gene and can be transformed into E. coli strain XL-1
Blue. Vector pCR®2.1, which is available from Invitrogen, 1600 Faraday Avenue,
Carlsbad, CA 92008, contains an ampicillin resistance gene and may be transformed
into E. coli strain DH10B, available from Life Technologies. (See, for instance, Clark,
J. M., Nuc. Acids Res. 16:9677-9686 (1988) and Mead, D. et al., Bio/Technology 9:
(1991).) Preferably, a polynucleotide of the present invention does not comprise the
phage vector sequences identified for the particular clone in Table 1, as well as the
corresponding plasmid vector sequences designated above.

The deposited material in the sample assigned the ATCC Deposit Number cited
in Table 1 for any given cDNA clone also may contain one or more additional plasmids,
each comprising a cDNA clone different from that given clone. Thus, deposits sharing
the same ATCC Deposit Number contain at least a plasmid for each cDNA clone
identified in Table 1. Typically, each ATCC deposit sample cited in Table 1 comprises
a mixture of approximately equal amounts (by weight) of about 50 plasmid DNAs, each
containing a different cDNA clone; but such a deposit sample may include plasmids for
more or less than 50 cDNA clones, up to about 500 cDNA clones.

Two approaches can be used to isolate a particular clone from the deposited
sample of plasmid DNAs cited for that clone in Table 1. First, a plasmid is directly
isolated by screening the clones using a polynucleotide probe corresponding to SEQ ID
NO:X.

Particularly, a specific polynucleotide with 30-40 nucleotides is synthesized
using an Applied Biosystems DNA synthesizer according to the sequence reported.
The oligonucleotide is labeled, for instance, with 32P-γ-ATP using T4 polynucleotide
kinase and purified according to routine methods. (E.g., Maniatis et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring, NY (1982).)

Also preferred is a method of treatment of an individual in need of an increased
level of a secreted protein activity, which method comprises administering to such an
individual a pharmaceutical composition comprising an amount of an isolated
polypeptide, polynucleotide, or antibody of the claimed invention effective to increase
the level of said protein activity in said individual.

Having generally described the invention, the same will be more readily
understood by reference to the following examples, which are provided by way of
illustration and are not intended as limiting.

Examples

Example 1: Isolation of a Selected cDNA Clone From the Deposited 
Sample 

Each cDNA clone in a cited ATCC deposit is contained in a plasmid vector.
Table 1 identifies the vectors used to construct the cDNA library from which each clone
was isolated. In many cases, the vector used to construct the library is a phage vector
from which a plasmid has been excised. The table immediately below correlates the
related plasmid for each phage vector used in constructing the cDNA library. For
example, where a particular clone is identified in Table 1 as being isolated in the vector
"Lambda Zap," the corresponding deposited clone is in "pBluescript."

Vector Used to Construct Library     Corresponding Deposited Plasmid

Lambda Zap                           pBluescript (pBS)
Uni-Zap XR                           pBluescript (pBS)
Zap Express                          pBK
lafmid BA                            plafmid BA
pSport1                              pSport1
pCMVSport 2.0                        pCMVSport 2.0
pCMVSport 3.0                        pCMVSport 3.0
pCR®2.1                              pCR®2.1

Vectors Lambda Zap (U.S. Patent Nos. 5,128,256 and 5,286,636), Uni-Zap
XR (U.S. Patent Nos. 5,128,256 and 5,286,636), Zap Express (U.S. Patent Nos.
5,128,256 and 5,286,636), pBluescript (pBS) (Short, J. M. et al., Nucleic Acids Res.
16:7583-7600 (1988); Alting-Mees, M. A. and Short, J. M., Nucleic Acids Res.
17:9494 (1989)) and pBK (Alting-Mees, M. A. et al., Strategies 5:58-61 (1992)) are
commercially available from Stratagene Cloning Systems, Inc., 11011 N. Torrey Pines
Road, La Jolla, CA, 92037. pBS contains an ampicillin resistance gene and pBK
contains a neomycin resistance gene. Both can be transformed into E. coli strain XL-1

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a nucleotide sequence encoding a
polypeptide wherein said polypeptide comprises an amino acid sequence that is at least
90% identical to a sequence of at least 10 contiguous amino acids in a sequence selected
from the group consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is
any integer as defined in Table 1; and a complete amino acid sequence of a secreted
protein encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table
1 and contained in the deposit with the ATCC Deposit Number shown for said cDNA
clone in Table 1.

Also preferred is an isolated nucleic acid molecule, wherein said nucleotide
sequence encoding a polypeptide has been optimized for expression of said polypeptide
in a prokaryotic host.

Also preferred is an isolated nucleic acid molecule, wherein said polypeptide
comprises an amino acid sequence selected from the group consisting of: an amino acid
sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1; and a
complete amino acid sequence of a secreted protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Further preferred is a method of making a recombinant vector comprising
inserting any of the above isolated nucleic acid molecules into a vector. Also preferred is
the recombinant vector produced by this method. Also preferred is a method of making
a recombinant host cell comprising introducing the vector into a host cell, as well as the
recombinant host cell produced by this method.

Also preferred is a method of making an isolated polypeptide comprising
culturing this recombinant host cell under conditions such that said polypeptide is
expressed and recovering said polypeptide. Also preferred is this method of making an
isolated polypeptide, wherein said recombinant host cell is a eukaryotic cell and said
polypeptide is a secreted portion of a human secreted protein comprising an amino acid
sequence selected from the group consisting of: an amino acid sequence of SEQ ID
NO:Y beginning with the residue at the position of the First Amino Acid of the Secreted
Portion of SEQ ID NO:Y wherein Y is an integer set forth in Table 1 and said position
of the First Amino Acid of the Secreted Portion of SEQ ID NO:Y is defined in Table 1;
and an amino acid sequence of a secreted portion of a protein encoded by a human
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. The
isolated polypeptide produced by this method is also preferred.
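In sequence terms, a "secreted portion" that begins at the First Amino Acid of the Secreted Portion can be pictured as a simple slice of the full-length polypeptide. The following is a minimal sketch only: the function name, the example sequence, and the position used are hypothetical placeholders and are not taken from Table 1.

```python
def secreted_portion(full_sequence: str, first_secreted_pos: int) -> str:
    """Return the residues from a 1-based start position through the end.

    first_secreted_pos plays the role of the "First Amino Acid of the
    Secreted Portion"; the value used below is hypothetical.
    """
    if not 1 <= first_secreted_pos <= len(full_sequence):
        raise ValueError("position lies outside the sequence")
    # Convert the 1-based residue position to a 0-based Python index.
    return full_sequence[first_secreted_pos - 1:]


# Hypothetical polypeptide: six leader residues followed by the mature chain.
example = "MKLLVAQPSTGE"
mature = secreted_portion(example, 7)
print(mature)
```

A real determination of the cleavage position would come from the table of the specification or from experiment, not from this helper.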
comprising an amino acid sequence that is at least 90% identical to a sequence of at least
10 contiguous amino acids in a sequence selected from the group consisting of: an
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1;
and a complete amino acid sequence of a protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is the above method wherein said step of comparing sequences is
performed by comparing the amino acid sequence determined from a polypeptide
molecule in said sample with said sequence selected from said group.

Also preferred is a method for identifying the species, tissue or cell type of a
biological sample which method comprises a step of detecting polypeptide molecules in
said sample, if any, comprising an amino acid sequence that is at least 90% identical to
a sequence of at least 10 contiguous amino acids in a sequence selected from the group
consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is any integer as
defined in Table 1; and a complete amino acid sequence of a secreted protein encoded
by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained
in the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is the above method for identifying the species, tissue or cell type
of a biological sample, which method comprises a step of detecting polypeptide
molecules comprising an amino acid sequence in a panel of at least two amino acid
sequences, wherein at least one sequence in said panel is at least 90% identical to a
sequence of at least 10 contiguous amino acids in a sequence selected from the above
group.

Also preferred is a method for diagnosing in a subject a pathological condition
associated with abnormal structure or expression of a gene encoding a secreted protein
identified in Table 1, which method comprises a step of detecting in a biological sample
obtained from said subject polypeptide molecules comprising an amino acid sequence in
a panel of at least two amino acid sequences, wherein at least one sequence in said panel
is at least 90% identical to a sequence of at least 10 contiguous amino acids in a
sequence selected from the group consisting of: an amino acid sequence of SEQ ID
NO:Y wherein Y is any integer as defined in Table 1; and a complete amino acid
sequence of a secreted protein encoded by a human cDNA clone identified by a cDNA
Clone Identifier in Table 1 and contained in the deposit with the ATCC Deposit Number
shown for said cDNA clone in Table 1.

In any of these methods, the step of detecting said polypeptide molecules
includes using an antibody.
Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to a sequence of at least about 30 contiguous amino acids in the
amino acid sequence of the secreted portion of the protein encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with
the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at 
least 95% identical to a sequence of at least about 100 contiguous amino acids in the 
amino acid sequence of the secreted portion of the protein encoded by a human cDNA 
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with 
the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at 
least 95% identical to the amino acid sequence of the secreted portion of the protein 
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and 
contained in the deposit with the ATCC Deposit Number shown for said cDNA clone in
Table 1.

Further preferred is an isolated antibody which binds specifically to a
polypeptide comprising an amino acid sequence that is at least 90% identical to a
sequence of at least 10 contiguous amino acids in a sequence selected from the group
consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is any integer as
defined in Table 1; and a complete amino acid sequence of a protein encoded by a
human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in
the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1.

Further preferred is a method for detecting in a biological sample a polypeptide
comprising an amino acid sequence which is at least 90% identical to a sequence of at
least 10 contiguous amino acids in a sequence selected from the group consisting of: an
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1;
and a complete amino acid sequence of a protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1; which method
comprises a step of comparing an amino acid sequence of at least one polypeptide
molecule in said sample with a sequence selected from said group and determining
whether the sequence of said polypeptide molecule in said sample is at least 90%
identical to said sequence of at least 10 contiguous amino acids.
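The comparison step above can be illustrated with a small sketch. It assumes an ungapped, position-by-position comparison over windows of exactly 10 residues, which is a simplification and not necessarily the alignment method the specification relies on. The function names and the example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(sample: str, reference: str, window: int = 10) -> float:
    """Highest ungapped identity between any window-length stretch of the
    sample and any window-length stretch of the reference."""
    if len(sample) < window or len(reference) < window:
        return 0.0
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_win = reference[i:i + window]
        for j in range(len(sample) - window + 1):
            best = max(best, percent_identity(sample[j:j + window], ref_win))
    return best


def meets_threshold(sample: str, reference: str,
                    window: int = 10, threshold: float = 90.0) -> bool:
    """True if some window pair reaches the identity threshold."""
    return best_window_identity(sample, reference, window) >= threshold


# Hypothetical sequences: the sample differs from the reference stretch
# "TAYIAKQRQI" at one position out of ten.
reference = "MKTAYIAKQRQISFVK"
sample = "GGTAYIAKQRQLGG"
print(meets_threshold(sample, reference))
```

In practice a gapped alignment tool would normally be used; the nested loop here is only meant to make the "at least 90% over at least 10 contiguous amino acids" criterion concrete.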

Also preferred is the above method wherein said step of comparing an amino
acid sequence of at least one polypeptide molecule in said sample with a sequence
selected from said group comprises determining the extent of specific binding of
polypeptides in said sample to an antibody which binds specifically to a polypeptide
identical to a sequence of at least 50 contiguous nucleotides in a sequence selected from
said group. 

Also preferred is a composition of matter comprising isolated nucleic acid
molecules wherein the nucleotide sequences of said nucleic acid molecules comprise a
panel of at least two nucleotide sequences, wherein at least one sequence in said panel is
at least 95% identical to a sequence of at least 50 contiguous nucleotides in a sequence
selected from the group consisting of: a nucleotide sequence of SEQ ID NO:X wherein
X is any integer as defined in Table 1; and a nucleotide sequence encoded by a human
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. The
nucleic acid molecules can comprise DNA molecules or RNA molecules.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 90% identical to a sequence of at least about 10 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1.

Also preferred is a polypeptide, wherein said sequence of contiguous amino
acids is included in the amino acid sequence of SEQ ID NO:Y in the range of positions
beginning with the residue at about the position of the First Amino Acid of the Secreted
Portion and ending with the residue at about the Last Amino Acid of the Open Reading
Frame as set forth for SEQ ID NO:Y in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to a sequence of at least about 30 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to a sequence of at least about 100 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to the complete amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 90% identical to a sequence of at least about 10 contiguous amino acids in the
complete amino acid sequence of a secreted protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is a polypeptide wherein said sequence of contiguous amino
acids is included in the amino acid sequence of a secreted portion of the secreted protein
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and
contained in the deposit with the ATCC Deposit Number shown for said cDNA clone in
Table 1.
comprises a step of comparing a nucleotide sequence of at least one nucleic acid
molecule in said sample with a sequence selected from said group and determining
whether the sequence of said nucleic acid molecule in said sample is at least 95%
identical to said selected sequence.

Also preferred is the above method wherein said step of comparing sequences
comprises determining the extent of nucleic acid hybridization between nucleic acid
molecules in said sample and a nucleic acid molecule comprising said sequence selected
from said group. Similarly, also preferred is the above method wherein said step of
comparing sequences is performed by comparing the nucleotide sequence determined
from a nucleic acid molecule in said sample with said sequence selected from said
group. The nucleic acid molecules can comprise DNA molecules or RNA molecules.

A further preferred embodiment is a method for identifying the species, tissue or
cell type of a biological sample which method comprises a step of detecting nucleic acid
molecules in said sample, if any, comprising a nucleotide sequence that is at least 95%
identical to a sequence of at least 50 contiguous nucleotides in a sequence selected from
the group consisting of: a nucleotide sequence of SEQ ID NO:X wherein X is any
integer as defined in Table 1; and a nucleotide sequence encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with
the ATCC Deposit Number shown for said cDNA clone in Table 1.

The method for identifying the species, tissue or cell type of a biological sample
can comprise a step of detecting nucleic acid molecules comprising a nucleotide
sequence in a panel of at least two nucleotide sequences, wherein at least one sequence
in said panel is at least 95% identical to a sequence of at least 50 contiguous nucleotides
in a sequence selected from said group.
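The panel criterion (at least two sequences, at least one of which meets the identity threshold) can be sketched as follows. This is an illustration under stated assumptions: identity is computed by an ungapped comparison over windows of exactly 50 nucleotides, and all names and sequences are hypothetical rather than drawn from Table 1.

```python
from typing import Iterable


def window_identity(seq: str, ref: str, window: int) -> float:
    """Best ungapped percent identity over all window-length stretches.

    Returns 0.0 when either sequence is shorter than the window.
    """
    best = 0.0
    for i in range(len(ref) - window + 1):
        ref_win = ref[i:i + window]
        for j in range(len(seq) - window + 1):
            matches = sum(x == y for x, y in zip(seq[j:j + window], ref_win))
            best = max(best, 100.0 * matches / window)
    return best


def panel_qualifies(panel: Iterable[str], references: Iterable[str],
                    window: int = 50, threshold: float = 95.0) -> bool:
    """True if the panel has at least two sequences and at least one of
    them reaches the threshold against at least one reference."""
    panel = list(panel)
    references = list(references)
    if len(panel) < 2:
        return False
    return any(window_identity(p, r, window) >= threshold
               for p in panel for r in references)


# Hypothetical 60-nucleotide reference and a two-member panel.
reference = "ACGT" * 15
panel = [reference[5:55], "T" * 50]
print(panel_qualifies(panel, [reference]))
```

Only one panel member needs to meet the threshold; the second member here is deliberately unrelated to show that.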
Also preferred is a method for diagnosing in a subject a pathological condition
associated with abnormal structure or expression of a gene encoding a secreted protein
identified in Table 1, which method comprises a step of detecting in a biological sample
obtained from said subject nucleic acid molecules, if any, comprising a nucleotide
sequence that is at least 95% identical to a sequence of at least 50 contiguous
nucleotides in a sequence selected from the group consisting of: a nucleotide sequence
of SEQ ID NO:X wherein X is any integer as defined in Table 1; and a nucleotide
sequence encoded by a human cDNA clone identified by a cDNA Clone Identifier in
Table 1 and contained in the deposit with the ATCC Deposit Number shown for said
cDNA clone in Table 1.

The method for diagnosing a pathological condition can comprise a step of
detecting nucleic acid molecules comprising a nucleotide sequence in a panel of at least
two nucleotide sequences, wherein at least one sequence in said panel is at least 95%
A further preferred embodiment is an isolated nucleic acid molecule comprising
a nucleotide sequence which is at least 95% identical to the complete nucleotide 
sequence of SEQ ID NO:X. 

Also preferred is an isolated nucleic acid molecule which hybridizes under 
stringent hybridization conditions to a nucleic acid molecule, wherein said nucleic acid
molecule which hybridizes does not hybridize under stringent hybridization conditions 
to a nucleic acid molecule having a nucleotide sequence consisting of only A residues or 
of only T residues. 

Also preferred is a composition of matter comprising a DNA molecule which 
comprises a human cDNA clone identified by a cDNA Clone Identifier in Table 1,
which DNA molecule is contained in the material deposited with the American Type 
Culture Collection and given the ATCC Deposit Number shown in Table 1 for said 
cDNA Clone Identifier. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least 50 contiguous
nucleotides in the nucleotide sequence of a human cDNA clone identified by a cDNA
Clone Identifier in Table 1, which DNA molecule is contained in the deposit given the
ATCC Deposit Number shown in Table 1.

Also preferred is an isolated nucleic acid molecule, wherein said sequence of at 
least 50 contiguous nucleotides is included in the nucleotide sequence of the complete
open reading frame sequence encoded by said human cDNA clone. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least 150 contiguous
nucleotides in the nucleotide sequence encoded by said human cDNA clone.

A further preferred embodiment is an isolated nucleic acid molecule comprising
a nucleotide sequence which is at least 95% identical to a sequence of at least 500
contiguous nucleotides in the nucleotide sequence encoded by said human cDNA clone.

A further preferred embodiment is an isolated nucleic acid molecule comprising 
a nucleotide sequence which is at least 95% identical to the complete nucleotide 
sequence encoded by said human cDNA clone.

A further preferred embodiment is a method for detecting in a biological sample
a nucleic acid molecule comprising a nucleotide sequence which is at least 95% identical
to a sequence of at least 50 contiguous nucleotides in a sequence selected from the
group consisting of: a nucleotide sequence of SEQ ID NO:X wherein X is any integer
as defined in Table 1; and a nucleotide sequence encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1; which method
Other Preferred Embodiments

Other preferred embodiments of the claimed invention include an isolated
nucleic acid molecule comprising a nucleotide sequence which is at least 95% identical
to a sequence of at least about 50 contiguous nucleotides in the nucleotide sequence of
SEQ ID NO:X wherein X is any integer as defined in Table 1.

Also preferred is a nucleic acid molecule wherein said sequence of contiguous
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of
positions beginning with the nucleotide at about the position of the 5' Nucleotide of the
Clone Sequence and ending with the nucleotide at about the position of the 3'
Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in Table 1.

Also preferred is a nucleic acid molecule wherein said sequence of contiguous
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of
positions beginning with the nucleotide at about the position of the 5' Nucleotide of the
Start Codon and ending with the nucleotide at about the position of the 3' Nucleotide of
the Clone Sequence as defined for SEQ ID NO:X in Table 1.

Similarly preferred is a nucleic acid molecule wherein said sequence of
contiguous nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the
range of positions beginning with the nucleotide at about the position of the 5'
Nucleotide of the First Amino Acid of the Signal Peptide and ending with the nucleotide
at about the position of the 3' Nucleotide of the Clone Sequence as defined for SEQ ID
NO:X in Table 1.

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least about 150 contiguous
nucleotides in the nucleotide sequence of SEQ ID NO:X.

Further preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least about 500 contiguous 
nucleotides in the nucleotide sequence of SEQ ID NO:X. 

A further preferred embodiment is a nucleic acid molecule comprising a
nucleotide sequence which is at least 95% identical to the nucleotide sequence of SEQ
ID NO:X beginning with the nucleotide at about the position of the 5' Nucleotide of the
First Amino Acid of the Signal Peptide and ending with the nucleotide at about the
position of the 3' Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in
Table 1.
Preferably, an ELISA assay can measure polypeptide level or activity in a
sample (e.g., biological sample) using a monoclonal or polyclonal antibody. The
antibody can measure polypeptide level or activity by either binding, directly or
indirectly, to the polypeptide or by competing with the polypeptide for a substrate.

All of the above assays can be used as diagnostic or prognostic markers. The
molecules discovered using these assays can be used to treat disease or to bring about a
particular result in a patient (e.g., blood vessel growth) by activating or inhibiting the
polypeptide/molecule. Moreover, the assays can discover agents which may inhibit or
enhance the production of the polypeptide from suitably manipulated cells or tissues.

Therefore, the invention includes a method of identifying compounds which
bind to a polypeptide of the invention comprising the steps of: (a) incubating a
candidate binding compound with a polypeptide of the invention; and (b) determining if
binding has occurred. Moreover, the invention includes a method of identifying
agonists/antagonists comprising the steps of: (a) incubating a candidate compound with
a polypeptide of the invention, (b) assaying a biological activity, and (c) determining if
a biological activity of the polypeptide has been altered.

Other Activities 

A polypeptide or polynucleotide of the present invention may also increase or
decrease the differentiation or proliferation of embryonic stem cells, in addition to, as
discussed above, the hematopoietic lineage.

A polypeptide or polynucleotide of the present invention may also be used to
modulate mammalian characteristics, such as body height, weight, hair color, eye color,
skin, percentage of adipose tissue, pigmentation, size, and shape (e.g., cosmetic
surgery). Similarly, a polypeptide or polynucleotide of the present invention may be
used to modulate mammalian metabolism affecting catabolism, anabolism, processing,
utilization, and storage of energy.

A polypeptide or polynucleotide of the present invention may be used to change
a mammal's mental state or physical state by influencing biorhythms, circadian
rhythms, depression (including depressive disorders), tendency for violence, tolerance
for pain, reproductive capabilities (preferably by Activin or Inhibin-like activity),
hormonal or endocrine levels, appetite, libido, memory, stress, or other cognitive
qualities.

A polypeptide or polynucleotide of the present invention may also be used as a
food additive or preservative, such as to increase or decrease storage capabilities, fat
content, lipid, protein, carbohydrate, vitamins, minerals, cofactors or other nutritional
components.
It is also contemplated that a polynucleotide or polypeptide of the present
invention may inhibit chemotactic activity. These molecules could also be used to treat 
disorders. Thus, a polynucleotide or polypeptide of the present invention could be used 
as an inhibitor of chemotaxis. 

Binding Activity 

A polypeptide of the present invention may be used to screen for molecules that
bind to the polypeptide or for molecules to which the polypeptide binds. The binding
of the polypeptide and the molecule may activate (agonist), increase, inhibit
(antagonist), or decrease activity of the polypeptide or the molecule bound. Examples
of such molecules include antibodies, oligonucleotides, proteins (e.g., receptors), or
small molecules.

Preferably, the molecule is closely related to the natural ligand of the
polypeptide, e.g., a fragment of the ligand, or a natural substrate, a ligand, a structural
or functional mimetic. (See, Coligan et al., Current Protocols in Immunology
1(2):Chapter 5 (1991).) Similarly, the molecule can be closely related to the natural
receptor to which the polypeptide binds, or at least, a fragment of the receptor capable
of being bound by the polypeptide (e.g., active site). In either case, the molecule can
be rationally designed using known techniques.

Preferably, the screening for these molecules involves producing appropriate
cells which express the polypeptide, either as a secreted protein or on the cell
membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli.
Cells expressing the polypeptide (or cell membrane containing the expressed
polypeptide) are then preferably contacted with a test compound potentially containing
the molecule to observe binding, stimulation, or inhibition of activity of either the
polypeptide or the molecule.

The assay may simply test binding of a candidate compound to the polypeptide,
wherein binding is detected by a label, or in an assay involving competition with a
labeled competitor. Further, the assay may test whether the candidate compound results
in a signal generated by binding to the polypeptide.

Alternatively, the assay can be carried out using cell-free preparations,
polypeptide/molecule affixed to a solid support, chemical libraries, or natural product
mixtures. The assay may also simply comprise the steps of mixing a candidate
compound with a solution containing a polypeptide, measuring polypeptide/molecule
activity or binding, and comparing the polypeptide/molecule activity or binding to a
standard.
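The final comparison step, measuring activity or binding and comparing it to a standard, can be pictured with a small calculation. The sketch below is illustrative only: the function names, the 20% tolerance band, and the signal values are hypothetical assumptions and are not specified in the text.

```python
def relative_activity(sample_signal: float, standard_signal: float) -> float:
    """Ratio of the signal measured with the candidate compound to the
    signal of the standard."""
    if standard_signal <= 0:
        raise ValueError("standard signal must be positive")
    return sample_signal / standard_signal


def classify_candidate(sample_signal: float, standard_signal: float,
                       tolerance: float = 0.2) -> str:
    """Label a candidate by how its signal compares to the standard.

    tolerance is a hypothetical cutoff: ratios within 1 +/- tolerance are
    treated as showing no clear change.
    """
    ratio = relative_activity(sample_signal, standard_signal)
    if ratio > 1 + tolerance:
        return "increases"
    if ratio < 1 - tolerance:
        return "decreases"
    return "no clear change"


# Hypothetical readings in arbitrary units.
print(classify_candidate(150.0, 100.0))
print(classify_candidate(50.0, 100.0))
print(classify_candidate(105.0, 100.0))
```

A real assay would set the cutoff from replicate measurements and appropriate controls rather than a fixed tolerance.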
or cardiac), vascular (including vascular endothelium), nervous, hematopoietic, and
skeletal (bone, cartilage, tendon, and ligament) tissue. Preferably, regeneration occurs
without scarring or with decreased scarring. Regeneration also may include angiogenesis.

Moreover, a polynucleotide or polypeptide of the present invention may increase
regeneration of tissues difficult to heal. For example, increased tendon/ligament
regeneration would quicken recovery time after damage. A polynucleotide or
polypeptide of the present invention could also be used prophylactically in an effort to
avoid damage. Specific diseases that could be treated include tendinitis, carpal tunnel
syndrome, and other tendon or ligament defects. A further example of tissue
regeneration of non-healing wounds includes pressure ulcers, ulcers associated with
vascular insufficiency, and surgical and traumatic wounds.

Similarly, nerve and brain tissue could also be regenerated by using a
polynucleotide or polypeptide of the present invention to proliferate and differentiate
nerve cells. Diseases that could be treated using this method include central and
peripheral nervous system diseases, neuropathies, or mechanical and traumatic
disorders (e.g., spinal cord disorders, head trauma, cerebrovascular disease, and
stroke). Specifically, diseases associated with peripheral nerve injuries, peripheral
neuropathy (e.g., resulting from chemotherapy or other medical therapies), localized
neuropathies, and central nervous system diseases (e.g., Alzheimer's disease,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and
Shy-Drager syndrome), could all be treated using the polynucleotide or polypeptide of the
present invention.

Chemotaxis 

A polynucleotide or polypeptide of the present invention may have chemotaxis
activity. A chemotaxic molecule attracts or mobilizes cells (e.g., monocytes,
fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial
cells) to a particular site in the body, such as a site of inflammation, infection, or
hyperproliferation. The mobilized cells can then fight off and/or heal the particular
trauma or abnormality.

A polynucleotide or polypeptide of the present invention may increase
chemotaxic activity of particular cells. These chemotactic molecules can then be used to
treat inflammation, infection, hyperproliferative disorders, or any immune system
disorder by increasing the number of cells targeted to a particular location in the body.
For example, chemotaxic molecules can be used to treat wounds and other trauma to
tissues by attracting immune cells to the injured location. Chemotactic molecules of the
present invention can also attract fibroblasts, which can be used to treat wounds.
related infections), paronychia, prosthesis-related infections, Reiter's Disease,
respiratory tract infections, such as Whooping Cough or Empyema, sepsis, Lyme
Disease, Cat-Scratch Disease, Dysentery, Paratyphoid Fever, food poisoning,
Typhoid, pneumonia, Gonorrhea, meningitis, Chlamydia, Syphilis, Diphtheria,
Leprosy, Paratuberculosis, Tuberculosis, Lupus, Botulism, gangrene, tetanus,
impetigo, Rheumatic Fever, Scarlet Fever, sexually transmitted diseases, skin diseases
(e.g., cellulitis, dermatomycoses), toxemia, urinary-tract infections, wound infections.
A polypeptide or polynucleotide of the present invention can be used to treat or detect
any of these symptoms or diseases.

Moreover, parasitic agents causing disease or symptoms that can be treated or
detected by a polynucleotide or polypeptide of the present invention include, but are not
limited to, the following families: Amebiasis, Babesiosis, Coccidiosis,
Cryptosporidiosis, Dientamoebiasis, Dourine, Ectoparasitic, Giardiasis, Helminthiasis,
Leishmaniasis, Theileriasis, Toxoplasmosis, Trypanosomiasis, and Trichomonas.
These parasites can cause a variety of diseases or symptoms, including, but not limited
to: Scabies, Trombiculiasis, eye infections, intestinal disease (e.g., dysentery,
giardiasis), liver disease, lung disease, opportunistic infections (e.g., AIDS related),
Malaria, pregnancy complications, and toxoplasmosis. A polypeptide or polynucleotide
of the present invention can be used to treat or detect any of these symptoms or
diseases.

Preferably, treatment using a polypeptide or polynucleotide of the present
invention could either be by administering an effective amount of a polypeptide to the
patient, or by removing cells from the patient, supplying the cells with a polynucleotide
of the present invention, and returning the engineered cells to the patient (ex vivo
therapy). Moreover, the polypeptide or polynucleotide of the present invention can be
used as an antigen in a vaccine to raise an immune response against infectious disease.

Regeneration 

A polynucleotide or polypeptide of the present invention can be used to
differentiate, proliferate, and attract cells, leading to the regeneration of tissues. (See,
Science 276:59-87 (1997).) The regeneration of tissues could be used to repair,
replace, or protect tissue damaged by congenital defects, trauma (wounds, burns,
incisions, or ulcers), age, disease (e.g., osteoporosis, osteoarthritis, periodontal
disease, liver failure), surgery, including cosmetic plastic surgery, fibrosis, reperfusion
injury, or systemic cytokine damage.

Tissues that could be regenerated using the present invention include organs
(e.g., pancreas, liver, intestine, kidney, skin, endothelium), muscle (smooth, skeletal
may be treated. The immune response may be increased by either enhancing an existing
immune response, or by initiating a new immune response. Alternatively, the
polypeptide or polynucleotide of the present invention may also directly inhibit the
infectious agent, without necessarily eliciting an immune response.

Viruses are one example of an infectious agent that can cause disease or
symptoms that can be treated or detected by a polynucleotide or polypeptide of the
present invention. Examples of viruses include, but are not limited to, the following
DNA and RNA viral families: Arbovirus, Adenoviridae, Arenaviridae, Arterivirus,
Birnaviridae, Bunyaviridae, Caliciviridae, Circoviridae, Coronaviridae, Flaviviridae,
Hepadnaviridae (Hepatitis), Herpesviridae (such as Cytomegalovirus, Herpes
Simplex, Herpes Zoster), Mononegavirus (e.g., Paramyxoviridae, Morbillivirus,
Rhabdoviridae), Orthomyxoviridae (e.g., Influenza), Papovaviridae, Parvoviridae,
Picornaviridae, Poxviridae (such as Smallpox or Vaccinia), Reoviridae (e.g.,
Rotavirus), Retroviridae (HTLV-I, HTLV-II, Lentivirus), and Togaviridae (e.g.,
Rubivirus). Viruses falling within these families can cause a variety of diseases or
symptoms, including, but not limited to: arthritis, bronchiolitis, encephalitis, eye
infections (e.g., conjunctivitis, keratitis), chronic fatigue syndrome, hepatitis (A, B, C,
E, Chronic Active, Delta), meningitis, opportunistic infections (e.g., AIDS),
pneumonia, Burkitt's Lymphoma, chickenpox, hemorrhagic fever, Measles, Mumps,
Parainfluenza, Rabies, the common cold, Polio, leukemia, Rubella, sexually
transmitted diseases, skin diseases (e.g., Kaposi's, warts), and viremia. A polypeptide
or polynucleotide of the present invention can be used to treat or detect any of these
symptoms or diseases.

Similarly, bacterial or fungal agents that can cause disease or symptoms and that
can be treated or detected by a polynucleotide or polypeptide of the present invention
include, but are not limited to, the following Gram-negative and Gram-positive bacterial
families and fungi: Actinomycetales (e.g., Corynebacterium, Mycobacterium,
Nocardia), Aspergillosis, Bacillaceae (e.g., Anthrax, Clostridium), Bacteroidaceae,
Blastomycosis, Bordetella, Borrelia, Brucellosis, Candidiasis, Campylobacter,
Coccidioidomycosis, Cryptococcosis, Dermatomycoses, Enterobacteriaceae (Klebsiella,
Salmonella, Serratia, Yersinia), Erysipelothrix, Helicobacter, Legionellosis,
Leptospirosis, Listeria, Mycoplasmatales, Neisseriaceae (e.g., Acinetobacter,
Gonorrhea, Meningococcal), Pasteurellaceae Infections (e.g., Actinobacillus,
Haemophilus, Pasteurella), Pseudomonas, Rickettsiaceae, Chlamydiaceae, Syphilis,
and Staphylococcal. These bacterial or fungal families can cause the following diseases
or symptoms, including, but not limited to: bacteremia, endocarditis, eye infections
(conjunctivitis, tuberculosis, uveitis), gingivitis, opportunistic infections (e.g., AIDS
shock, sepsis, or systemic inflammatory response syndrome (SIRS)),
ischemia-reperfusion injury, endotoxin lethality, arthritis, complement-mediated hyperacute
rejection, nephritis, cytokine or chemokine induced lung injury, inflammatory bowel
disease, Crohn's disease, or resulting from over production of cytokines (e.g., TNF or
IL-1.)

Hyperproliferative Disorders

A polypeptide or polynucleotide can be used to treat or detect hyperproliferative
disorders, including neoplasms. A polypeptide or polynucleotide of the present
invention may inhibit the proliferation of the disorder through direct or indirect
interactions. Alternatively, a polypeptide or polynucleotide of the present invention
may proliferate other cells which can inhibit the hyperproliferative disorder.

For example, by increasing an immune response, particularly increasing
antigenic qualities of the hyperproliferative disorder or by proliferating, differentiating,
or mobilizing T-cells, hyperproliferative disorders can be treated. This immune
response may be increased by either enhancing an existing immune response, or by
initiating a new immune response. Alternatively, decreasing an immune response may
also be a method of treating hyperproliferative disorders, such as a chemotherapeutic
agent.

Examples of hyperproliferative disorders that can be treated or detected by a
polynucleotide or polypeptide of the present invention include, but are not limited to,
neoplasms located in the: abdomen, bone, breast, digestive system, liver, pancreas,
peritoneum, endocrine glands (adrenal, parathyroid, pituitary, testicles, ovary, thymus,
thyroid), eye, head and neck, nervous (central and peripheral), lymphatic system,
pelvic, skin, soft tissue, spleen, thoracic, and urogenital.

Similarly, other hyperproliferative disorders can also be treated or detected by a
polynucleotide or polypeptide of the present invention. Examples of such
hyperproliferative disorders include, but are not limited to: hypergammaglobulinemia,
lymphoproliferative disorders, paraproteinemias, purpura, sarcoidosis, Sezary
Syndrome, Waldenstrom's Macroglobulinemia, Gaucher's Disease, histiocytosis, and
any other hyperproliferative disease, besides neoplasia, located in an organ system
listed above.

Infectious Disease 

A polypeptide or polynucleotide of the present invention can be used to treat or
detect infectious agents. For example, by increasing the immune response, particularly
increasing the proliferation and differentiation of B and/or T cells, infectious diseases
decrease hemostatic or thrombolytic activity could be used to inhibit or dissolve
clotting. These molecules could be important in the treatment of heart attacks
(infarction), strokes, or scarring.

A polynucleotide or polypeptide of the present invention may also be useful in
treating or detecting autoimmune disorders. Many autoimmune disorders result from
inappropriate recognition of self as foreign material by immune cells. This
inappropriate recognition results in an immune response leading to the destruction of the
host tissue. Therefore, the administration of a polypeptide or polynucleotide of the
present invention that inhibits an immune response, particularly the proliferation,
differentiation, or chemotaxis of T-cells, may be an effective therapy in preventing
autoimmune disorders.

Examples of autoimmune disorders that can be treated or detected by the present
invention include, but are not limited to: Addison's Disease, hemolytic anemia,
antiphospholipid syndrome, rheumatoid arthritis, dermatitis, allergic encephalomyelitis,
glomerulonephritis, Goodpasture's Syndrome, Graves' Disease, Multiple Sclerosis,
Myasthenia Gravis, Neuritis, Ophthalmia, Bullous Pemphigoid, Pemphigus,
Polyendocrinopathies, Purpura, Reiter's Disease, Stiff-Man Syndrome, Autoimmune
Thyroiditis, Systemic Lupus Erythematosus, Autoimmune Pulmonary Inflammation,
Guillain-Barre Syndrome, insulin dependent diabetes mellitus, and autoimmune
inflammatory eye disease.

Similarly, allergic reactions and conditions, such as asthma (particularly allergic
asthma) or other respiratory problems, may also be treated by a polypeptide or
polynucleotide of the present invention. Moreover, these molecules can be used to treat
anaphylaxis, hypersensitivity to an antigenic molecule, or blood group incompatibility.

A polynucleotide or polypeptide of the present invention may also be used to
treat and/or prevent organ rejection or graft-versus-host disease (GVHD). Organ
rejection occurs by host immune cell destruction of the transplanted tissue through an
immune response. Similarly, an immune response is also involved in GVHD, but, in
this case, the foreign transplanted immune cells destroy the host tissues. The
administration of a polypeptide or polynucleotide of the present invention that inhibits
an immune response, particularly the proliferation, differentiation, or chemotaxis of
T-cells, may be an effective therapy in preventing organ rejection or GVHD.

Similarly, a polypeptide or polynucleotide of the present invention may also be
used to modulate inflammation. For example, the polypeptide or polynucleotide may
inhibit the proliferation and differentiation of cells involved in an inflammatory
response. These molecules can be used to treat inflammatory conditions, both chronic
and acute conditions, including inflammation associated with infection (e.g., septic
Biological Activities

The polynucleotides and polypeptides of the present invention can be used in
assays to test for one or more biological activities. If these polynucleotides and 
polypeptides do exhibit activity in a particular assay, it is likely that these molecules 
may be involved in the diseases associated with the biological activity. Thus, the 
polynucleotides and polypeptides could be used to treat the associated disease. 

Immune Activity 

A polypeptide or polynucleotide of the present invention may be useful in 
treating deficiencies or disorders of the immune system, by activating or inhibiting the 
proliferation, differentiation, or mobilization (chemotaxis) of immune cells. Immune 
cells develop through a process called hematopoiesis, producing myeloid (platelets, red 
blood cells, neutrophils, and macrophages) and lymphoid (B and T lymphocytes) cells 
from pluripotent stem cells. The etiology of these immune deficiencies or disorders 
may be genetic, somatic, such as cancer or some autoimmune disorders, acquired (e.g.,
by chemotherapy or toxins), or infectious. Moreover, a polynucleotide or polypeptide 
of the present invention can be used as a marker or detector of a particular immune 
system disease or disorder. 

A polynucleotide or polypeptide of the present invention may be useful in 
treating or detecting deficiencies or disorders of hematopoietic cells. A polypeptide or 
polynucleotide of the present invention could be used to increase differentiation and 
proliferation of hematopoietic cells, including the pluripotent stem cells, in an effort to 
treat those disorders associated with a decrease in certain (or many) types of hematopoietic
cells. Examples of immunologic deficiency syndromes include, but are not limited to:
blood protein disorders (e.g., agammaglobulinemia, dysgammaglobulinemia), ataxia
telangiectasia, common variable immunodeficiency, DiGeorge Syndrome, HIV
infection, HTLV-BLV infection, leukocyte adhesion deficiency syndrome, 
lymphopenia, phagocyte bactericidal dysfunction, severe combined immunodeficiency 
(SCIDs), Wiskott-Aldrich Disorder, anemia, thrombocytopenia, or hemoglobinuria. 

Moreover, a polypeptide or polynucleotide of the present invention could also 
be used to modulate hemostatic (the stopping of bleeding) or thrombolytic activity (clot 
formation). For example, by increasing hemostatic or thrombolytic activity, a
polynucleotide or polypeptide of the present invention could be used to treat blood
coagulation disorders (e.g., afibrinogenemia, factor deficiencies), blood platelet
disorders (e.g., thrombocytopenia), or wounds resulting from trauma, surgery, or other
causes. Alternatively, a polynucleotide or polypeptide of the present invention that can 
resonance, is introduced (for example, parenterally, subcutaneously, or
intraperitoneally) into the mammal. It will be understood in the art that the size of the
subject and the imaging system used will determine the quantity of imaging moiety
needed to produce diagnostic images. In the case of a radioisotope moiety, for a human
subject, the quantity of radioactivity injected will normally range from about 5 to 20
millicuries of 99mTc. The labeled antibody or antibody fragment will then
preferentially accumulate at the location of cells which contain the specific protein. In
vivo tumor imaging is described in S.W. Burchiel et al., "Immunopharmacokinetics of
Radiolabeled Antibodies and Their Fragments." (Chapter 13 in Tumor Imaging: The
Radiochemical Detection of Cancer, S.W. Burchiel and B. A. Rhodes, eds., Masson
Publishing Inc. (1982).)

Thus, the invention provides a method of diagnosing a disorder, which involves
(a) assaying the expression of a polypeptide of the present invention in cells or body
fluid of an individual; and (b) comparing the level of gene expression with a standard gene
expression level, whereby an increase or decrease in the assayed polypeptide gene
expression level compared to the standard expression level is indicative of a disorder.

Moreover, polypeptides of the present invention can be used to treat disease.
For example, patients can be administered a polypeptide of the present invention in an
effort to replace absent or decreased levels of the polypeptide (e.g., insulin), to
supplement absent or decreased levels of a different polypeptide (e.g., hemoglobin S
for hemoglobin B), to inhibit the activity of a polypeptide (e.g., an oncogene), to
activate the activity of a polypeptide (e.g., by binding to a receptor), to reduce the
activity of a membrane bound receptor by competing with it for free ligand (e.g.,
soluble TNF receptors used in reducing inflammation), or to bring about a desired
response (e.g., blood vessel growth).

Similarly, antibodies directed to a polypeptide of the present invention can also
be used to treat disease. For example, administration of an antibody directed to a
polypeptide of the present invention can bind and reduce overproduction of the
polypeptide. Similarly, administration of an antibody can activate the polypeptide, such
as by binding to a polypeptide bound to a membrane (receptor).

At the very least, the polypeptides of the present invention can be used as
molecular weight markers on SDS-PAGE gels or on molecular sieve gel filtration
columns using methods well known to those of skill in the art. Polypeptides can also
be used to raise antibodies, which in turn are used to measure protein expression from a
recombinant cell, as a way of assessing transformation of the host cell. Moreover, the
polypeptides of the present invention can be used to test the following biological
activities.
unknown origin. Appropriate reagents can comprise, for example, DNA probes or
primers specific to particular tissue prepared from the sequences of the present
invention. Panels of such reagents can identify tissue by species and/or by organ type.
In a similar fashion, these reagents can be used to screen tissue cultures for
contamination.

At the very least, the polynucleotides of the present invention can be used as
molecular weight markers on Southern gels, as diagnostic probes for the presence of a
specific mRNA in a particular cell type, as a probe to "subtract-out" known sequences
in the process of discovering novel polynucleotides, for selecting and making oligomers
for attachment to a "gene chip" or other support, to raise anti-DNA antibodies using
DNA immunization techniques, and as an antigen to elicit an immune response.

Uses of the Polypeptides 

Each of the polypeptides identified herein can be used in numerous ways. The 
following description should be considered exemplary and utilizes known techniques. 

A polypeptide of the present invention can be used to assay protein levels in a 
biological sample using antibody-based techniques. For example, protein expression in 
tissues can be studied with classical immunohistological methods. (Jalkanen, M., et
al., J. Cell. Biol. 101:976-985 (1985); Jalkanen, M., et al., J. Cell. Biol.
105:3087-3096 (1987).) Other antibody-based methods useful for detecting protein gene
expression include immunoassays, such as the enzyme linked immunosorbent assay
(ELISA) and the radioimmunoassay (RIA). Suitable antibody assay labels are known
in the art and include enzyme labels, such as glucose oxidase, and radioisotopes, such
as iodine (125I, 121I), carbon (14C), sulfur (35S), tritium (3H), indium (112In), and
technetium (99mTc), and fluorescent labels, such as fluorescein and rhodamine, and
biotin.

In addition to assaying secreted protein levels in a biological sample, proteins 
can also be detected in vivo by imaging. Antibody labels or markers for in vivo 
imaging of protein include those detectable by X-radiography, NMR or ESR. For
X-radiography, suitable labels include radioisotopes such as barium or cesium, which emit
detectable radiation but are not overtly harmful to the subject. Suitable markers for 
NMR and ESR include those with a detectable characteristic spin, such as deuterium, 
which may be incorporated into the antibody by labeling of nutrients for the relevant 
hybridoma. 

A protein-specific antibody or antibody fragment which has been labeled with
an appropriate detectable imaging moiety, such as a radioisotope (for example, 131I,
112In, 99mTc), a radio-opaque substance, or a material detectable by nuclear magnetic
systems, and the information disclosed herein can be used to design antisense or triple
helix polynucleotides in an effort to treat disease.
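
As an illustration of antisense design, the following minimal Python sketch derives an antisense RNA from an mRNA segment by taking its reverse complement, restricted to the 20 to 40 base length this specification gives for such polynucleotides. The function name and the target sequence are hypothetical and serve only as an example; a practical design would also weigh accessibility of the target site and specificity.

```python
# Hypothetical helper; the target sequence below is invented for illustration.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def antisense_rna(mrna_segment: str) -> str:
    """Return the antisense RNA (5'->3') for an mRNA segment written 5'->3'."""
    segment = mrna_segment.upper()
    if not 20 <= len(segment) <= 40:
        raise ValueError("segment should be 20 to 40 bases long")
    # Reverse the segment, then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(segment))


target = "AUGGCUAGCUUAGGCAUCGAUCC"  # 23-base made-up mRNA segment
print(antisense_rna(target))
```

Applying the function twice returns the original segment, which is a quick check that the oligomer is a true reverse complement.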

Polynucleotides of the present invention are also useful in gene therapy. One
goal of gene therapy is to insert a normal gene into an organism having a defective
gene, in an effort to correct the genetic defect. The polynucleotides disclosed in the
present invention offer a means of targeting such genetic defects in a highly accurate
manner. Another goal is to insert a new gene that was not present in the host genome,
thereby producing a new trait in the host cell.

The polynucleotides are also useful for identifying individuals from minute
biological samples. The United States military, for example, is considering the use of
restriction fragment length polymorphism (RFLP) for identification of its personnel. In
this technique, an individual's genomic DNA is digested with one or more restriction
enzymes, and probed on a Southern blot to yield unique bands for identifying
personnel. This method does not suffer from the current limitations of "Dog Tags"
which can be lost, switched, or stolen, making positive identification difficult. The
polynucleotides of the present invention can be used as additional DNA markers for
RFLP.
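
The principle behind RFLP can be illustrated with a minimal Python sketch, given here under stated assumptions: the two short sequences are invented, the enzyme is taken to be EcoRI (recognition site GAATTC, cutting after the first G), and the DNA is treated as a single linear strand. A single base change that destroys a recognition site changes the set of fragment lengths, which is what appears as a different band pattern on a Southern blot.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting linear DNA at every `site`.

    `cut_offset` is the position within the site where the strand is cut
    (1 for EcoRI, which cuts between G and AATTC).
    """
    dna = dna.upper()
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Invented sequences: individual B carries one base change in the second site.
individual_a = "TTGAATTCAAGGGAATTCCC"
individual_b = "TTGAATTCAAGGGAGTTCCC"
print(digest(individual_a))
print(digest(individual_b))
```

The differing fragment lists stand in for the differing band patterns that distinguish the two individuals.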

The polynucleotides of the present invention can also be used as an alternative to
RFLP, by determining the actual base-by-base DNA sequence of selected portions of an
individual's genome. These sequences can be used to prepare PCR primers for
amplifying and isolating such selected DNA, which can then be sequenced. Using this
technique, individuals can be identified because each individual will have a unique set
of DNA sequences. Once a unique ID database is established for an individual,
positive identification of that individual, living or dead, can be made from extremely
small tissue samples.

Forensic biology also benefits from using DNA-based identification techniques
as disclosed herein. DNA sequences taken from very small biological samples such as
tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, semen, etc., can be
amplified using PCR. In one prior art technique, gene sequences amplified from
polymorphic loci, such as DQa class II HLA gene, are used in forensic biology to
identify individuals. (Erlich, H., PCR Technology, Freeman and Co. (1992).) Once
these specific polymorphic loci are amplified, they are digested with one or more
restriction enzymes, yielding an identifying set of bands on a Southern blot probed with
DNA corresponding to the DQa class II HLA gene. Similarly, polynucleotides of the
present invention can be used as polymorphic markers for forensic purposes.

There is also a need for reagents capable of identifying the source of a particular
tissue. Such need arises, for example, in forensics when presented with tissue of
more likely conserved within gene families, thus increasing the chance of cross
hybridization during chromosomal mapping.

Once a polynucleotide has been mapped to a precise chromosomal location, the
physical position of the polynucleotide can be used in linkage analysis. Linkage
analysis establishes coinheritance between a chromosomal location and presentation of a
particular disease. (Disease mapping data are found, for example, in V. McKusick,
Mendelian Inheritance in Man (available on line through Johns Hopkins University
Welch Medical Library).) Assuming 1 megabase mapping resolution and one gene per
20 kb, a cDNA precisely localized to a chromosomal region associated with the disease
could be one of 50-500 potential causative genes.
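
The arithmetic behind this estimate can be sketched in a few lines of Python. With one gene per 20 kb, a 1 megabase region holds about 50 genes. The upper figure of 500 would correspond to a region of about 10 megabases; that correspondence is an inference made here for illustration and is not stated in the text. The function name is hypothetical.

```python
def candidate_gene_count(region_bp: int, bp_per_gene: int = 20_000) -> int:
    """Estimate how many genes fall within a mapped chromosomal region."""
    return region_bp // bp_per_gene


# 1 megabase resolution at one gene per 20 kb.
print(candidate_gene_count(1_000_000))
# A coarser, assumed 10 megabase region at the same gene density.
print(candidate_gene_count(10_000_000))
```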

Thus, once coinheritance is established, differences in the polynucleotide and
the corresponding gene between affected and unaffected individuals can be examined.
First, visible structural alterations in the chromosomes, such as deletions or
translocations, are examined in chromosome spreads or by PCR. If no structural
alterations exist, the presence of point mutations is ascertained. Mutations observed in
some or all affected individuals, but not in normal individuals, indicate that the
mutation may cause the disease. However, complete sequencing of the polypeptide and
the corresponding gene from several normal individuals is required to distinguish the
mutation from a polymorphism. If a new polymorphism is identified, this polymorphic
polypeptide can be used for further linkage analysis.

Furthermore, increased or decreased expression of the gene in affected
individuals as compared to unaffected individuals can be assessed using
polynucleotides of the present invention. Any of these alterations (altered expression,
chromosomal rearrangement, or mutation) can be used as a diagnostic or prognostic
marker.

In addition to the foregoing, a polynucleotide can be used to control gene
expression through triple helix formation or antisense DNA or RNA. Both methods
rely on binding of the polynucleotide to DNA or RNA. For these techniques, preferred
polynucleotides are usually 20 to 40 bases in length and complementary to either the
region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids
Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science
251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC
Press, Boca Raton, FL (1988).) Triple helix formation optimally results in a shut-off
of RNA transcription from DNA, while antisense RNA hybridization blocks translation
of an mRNA molecule into polypeptide. Both techniques are effective in model
after translation in all eukaryotic cells. While the N-terminal methionine on most
proteins also is efficiently removed in most prokaryotes, for some proteins, this
prokaryotic removal process is inefficient, depending on the nature of the amino acid to
which the N-terminal methionine is covalently linked.

Uses of the Polynucleotides 

Each of the polynucleotides identified herein can be used in numerous ways as 
reagents. The following description should be considered exemplary and utilizes 
known techniques. 

The polynucleotides of the present invention are useful for chromosome
identification. There exists an ongoing need to identify new chromosome markers,
since few chromosome marking reagents based on actual sequence data (repeat
polymorphisms) are presently available. Each polynucleotide of the present invention
can be used as a chromosome marker.

Briefly, sequences can be mapped to chromosomes by preparing PCR primers
(preferably 15-25 bp) from the sequences shown in SEQ ID NO:X. Primers can be
selected using computer analysis so that primers do not span more than one predicted
exon in the genomic DNA. These primers are then used for PCR screening of somatic
cell hybrids containing individual human chromosomes. Only those hybrids containing
the human gene corresponding to the SEQ ID NO:X will yield an amplified fragment.
Similarly, somatic hybrids provide a rapid method of PCR mapping the
polynucleotides to particular chromosomes. Three or more clones can be assigned per
day using a single thermal cycler. Moreover, sublocalization of the polynucleotides can
be achieved with panels of specific chromosome fragments. Other gene mapping
strategies that can be used include in situ hybridization, prescreening with labeled
flow-sorted chromosomes, and preselection by hybridization to construct
chromosome-specific cDNA libraries.
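
The computer selection of primers mentioned above can be illustrated by a minimal Python sketch. It is given under stated assumptions: the cDNA sequence and the predicted exon coordinates are invented, the coordinates are half-open intervals on the cDNA, and the function name is hypothetical. The sketch only enforces the two constraints stated here (a length of 15 to 25 bases and no primer spanning more than one predicted exon); real primer design would also consider melting temperature and specificity.

```python
from typing import Iterator


def primers_within_exons(
    cdna: str,
    exon_bounds: list[tuple[int, int]],
    length: int = 20,
) -> Iterator[tuple[int, str]]:
    """Yield (start, primer) for each window lying entirely inside one exon."""
    if not 15 <= length <= 25:
        raise ValueError("primer length should be 15 to 25 bases")
    for exon_start, exon_end in exon_bounds:  # half-open [start, end)
        # Windows that would run past the exon end are never generated.
        for start in range(exon_start, exon_end - length + 1):
            yield start, cdna[start:start + length]


cdna = "ACGT" * 15            # 60-base made-up sequence
exons = [(0, 30), (30, 60)]   # assumed predicted exon boundaries
candidates = list(primers_within_exons(cdna, exons, length=20))
print(len(candidates))
```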

Precise chromosomal location of the polynucleotides can also be achieved using
fluorescence in situ hybridization (FISH) of a metaphase chromosomal spread. This
technique uses polynucleotides as short as 500 or 600 bases; however, polynucleotides
2,000-4,000 bp are preferred. For a review of this technique, see Verma et al.,
"Human Chromosomes: a Manual of Basic Techniques," Pergamon Press, New York
(1988).

For chromosome mapping, the polynucleotides can be used individually (to
mark a single chromosome or a single site on that chromosome) or in panels (for
marking multiple sites and/or multiple chromosomes). Preferred polynucleotides
correspond to the noncoding regions of the cDNAs because the coding sequences are



WO 98/54963 



PCT/US98/11422 



genes for culturing in E. coli and other bacteria. Representative examples of
appropriate hosts include, but are not limited to, bacterial cells, such as E. coli,
Streptomyces and Salmonella typhimurium cells; fungal cells, such as yeast cells; insect
cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS,
293, and Bowes melanoma cells; and plant cells. Appropriate culture media and
conditions for the above-described host cells are known in the art.

Vectors preferred for use in bacteria include pQE70, pQE60 and pQE-9,
available from QIAGEN, Inc.; pBluescript vectors, Phagescript vectors, pNH8A,
pNH16a, pNH18A, pNH46A, available from Stratagene Cloning Systems, Inc.; and
ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from Pharmacia Biotech,
Inc. Among preferred eukaryotic vectors are pWLNEO, pSV2CAT, pOG44, pXT1
and pSG available from Stratagene; and pSVK3, pBPV, pMSG and pSVL available
from Pharmacia. Other suitable vectors will be readily apparent to the skilled artisan.

Introduction of the construct into the host cell can be effected by calcium
phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated
transfection, electroporation, transduction, infection, or other methods. Such methods
are described in many standard laboratory manuals, such as Davis et al., Basic Methods
In Molecular Biology (1986). It is specifically contemplated that the polypeptides of the
present invention may in fact be expressed by a host cell lacking a recombinant vector.

A polypeptide of this invention can be recovered and purified from recombinant
cell cultures by well-known methods including ammonium sulfate or ethanol
precipitation, acid extraction, anion or cation exchange chromatography,
phosphocellulose chromatography, hydrophobic interaction chromatography, affinity
chromatography, hydroxylapatite chromatography and lectin chromatography. Most
preferably, high performance liquid chromatography ("HPLC") is employed for
purification.

Polypeptides of the present invention, and preferably the secreted form, can also
be recovered from: products purified from natural sources, including bodily fluids,
tissues and cells, whether directly isolated or cultured; products of chemical synthetic
procedures; and products produced by recombinant techniques from a prokaryotic or
eukaryotic host, including, for example, bacterial, yeast, higher plant, insect, and
mammalian cells. Depending upon the host employed in a recombinant production
procedure, the polypeptides of the present invention may be glycosylated or may be
non-glycosylated. In addition, polypeptides of the invention may also include an initial
modified methionine residue, in some cases as a result of host-mediated processes.
Thus, it is well known in the art that the N-terminal methionine encoded by the
translation initiation codon generally is removed with high efficiency from any protein
Bennett et al., J. Molecular Recognition 8:52-53 (1995); K. Johanson et al., J. Biol.
Chem. 270:9459-9471 (1995).)

Moreover, the polypeptides of the present invention can be fused to marker
sequences, such as a peptide which facilitates purification of the fused polypeptide. In
preferred embodiments, the marker amino acid sequence is a hexa-histidine peptide,
such as the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue,
Chatsworth, CA, 91311), among others, many of which are commercially available.
As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86:821-824 (1989), for
instance, hexa-histidine provides for convenient purification of the fusion protein.
Another peptide tag useful for purification, the "HA" tag, corresponds to an epitope
derived from the influenza hemagglutinin protein. (Wilson et al., Cell 37:767 (1984).)

Thus, any of these above fusions can be engineered using the polynucleotides
or the polypeptides of the present invention.

Vectors, Host Cells, and Protein Production

The present invention also relates to vectors containing the polynucleotide of the
present invention, host cells, and the production of polypeptides by recombinant
techniques. The vector may be, for example, a phage, plasmid, viral, or retroviral
vector. Retroviral vectors may be replication competent or replication defective. In the
latter case, viral propagation generally will occur only in complementing host cells.

The polynucleotides may be joined to a vector containing a selectable marker for
propagation in a host. Generally, a plasmid vector is introduced in a precipitate, such
as a calcium phosphate precipitate, or in a complex with a charged lipid. If the vector is
a virus, it may be packaged in vitro using an appropriate packaging cell line and then
transduced into host cells.

The polynucleotide insert should be operatively linked to an appropriate
promoter, such as the phage lambda PL promoter, the E. coli lac, trp, phoA and tac
promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to
name a few. Other suitable promoters will be known to the skilled artisan. The
expression constructs will further contain sites for transcription initiation, termination,
and, in the transcribed region, a ribosome binding site for translation. The coding
portion of the transcripts expressed by the constructs will preferably include a
translation initiating codon at the beginning and a termination codon (UAA, UGA or
UAG) appropriately positioned at the end of the polypeptide to be translated.
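
The requirement on the coding portion can be expressed as a simple check, sketched below in Python under stated assumptions: the initiating codon is taken to be AUG (the text does not name it), the transcript is written as RNA in frame from its first base, and the function name and example sequences are hypothetical. The check confirms an initiating codon at the beginning, one of the three termination codons at the end, and no termination codon in between.

```python
STOP_CODONS = {"UAA", "UGA", "UAG"}
START_CODON = "AUG"  # assumed; the standard translation initiating codon


def has_valid_reading_frame(coding_rna: str) -> bool:
    """Check start codon first, stop codon last, and no internal stop codon."""
    rna = coding_rna.upper()
    if len(rna) < 6 or len(rna) % 3 != 0:
        return False
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    if codons[0] != START_CODON or codons[-1] not in STOP_CODONS:
        return False
    # A stop codon before the last position would truncate the polypeptide.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


print(has_valid_reading_frame("AUGGCUUAA"))     # start, one codon, stop
print(has_valid_reading_frame("AUGUAAGCUUAA"))  # premature stop codon
```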

As indicated, the expression vectors will preferably include at least one
selectable marker. Such markers include dihydrofolate reductase, G418 or neomycin
resistance for eukaryotic cell culture and tetracycline, kanamycin or ampicillin resistance
polypeptide of the present invention can be used to indirectly detect the second protein
by binding to the polypeptide. Moreover, because secreted proteins target cellular
locations based on trafficking signals, the polypeptides of the present invention can be
used as targeting molecules once fused to other proteins.

Examples of domains that can be fused to polypeptides of the present invention
include not only heterologous signal sequences, but also other heterologous functional
regions. The fusion does not necessarily need to be direct, but may occur through
linker sequences.

Moreover, fusion proteins may also be engineered to improve characteristics of
the polypeptide of the present invention. For instance, a region of additional amino
acids, particularly charged amino acids, may be added to the N-terminus of the
polypeptide to improve stability and persistence during purification from the host cell or
subsequent handling and storage. Also, peptide moieties may be added to the
polypeptide to facilitate purification. Such regions may be removed prior to final
preparation of the polypeptide. The addition of peptide moieties to facilitate handling of
polypeptides is a familiar and routine technique in the art.

Moreover, polypeptides of the present invention, including fragments, and
specifically epitopes, can be combined with parts of the constant domain of
immunoglobulins (IgG), resulting in chimeric polypeptides. These fusion proteins
facilitate purification and show an increased half-life in vivo. One reported example
describes chimeric proteins consisting of the first two domains of the human
CD4-polypeptide and various domains of the constant regions of the heavy or light chains of
mammalian immunoglobulins. (EP A 394,827; Traunecker et al., Nature 331:84-86
(1988).) Fusion proteins having disulfide-linked dimeric structures (due to the IgG)
can also be more efficient in binding and neutralizing other molecules than the
monomeric secreted protein or protein fragment alone. (Fountoulakis et al., J.
Biochem. 270:3958-3964 (1995).)

Similarly, EP-A-0 464 533 (Canadian counterpart 2045869) discloses fusion
proteins comprising various portions of constant region of immunoglobulin molecules
together with another human protein or part thereof. In many cases, the Fc part in a
fusion protein is beneficial in therapy and diagnosis, and thus can result in, for
example, improved pharmacokinetic properties. (EP-A 0232 262.) Alternatively,
deleting the Fc part after the fusion protein has been expressed, detected, and purified,
would be desired. For example, the Fc portion may hinder therapy and diagnosis if the
fusion protein is used as an antigen for immunizations. In drug discovery, for
example, human proteins, such as hIL-5, have been fused with Fc portions for the
purpose of high-throughput screening assays to identify antagonists of hIL-5. (See, D.
epitope, as well as the polynucleotide encoding this fragment. A region of a protein
molecule to which an antibody can bind is defined as an "antigenic epitope." In
contrast, an "immunogenic epitope" is defined as a part of a protein that elicits an
antibody response. (See, for instance, Geysen et al., Proc. Natl. Acad. Sci. USA
81:3998-4002 (1983).)

Fragments which function as epitopes may be produced by any conventional 
means. (See, e.g., Houghten, R. A., Proc. Natl. Acad. Sci. USA 82:5131-5135 
(1985), further described in U.S. Patent No. 4,631,211.)

In the present invention, antigenic epitopes preferably contain a sequence of at
least seven, more preferably at least nine, and most preferably between about 15 and
about 30 amino acids. Antigenic epitopes are useful to raise antibodies, including
monoclonal antibodies, that specifically bind the epitope. (See, for instance, Wilson et
al., Cell 37:767-778 (1984); Sutcliffe, J. G. et al., Science 219:660-666 (1983).)

Similarly, immunogenic epitopes can be used to induce antibodies according to
methods well known in the art. (See, for instance, Sutcliffe et al., supra; Wilson et al.,
supra; Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et
al., J. Gen. Virol. 66:2347-2354 (1985).) A preferred immunogenic epitope includes
the secreted protein. The immunogenic epitopes may be presented together with a
carrier protein, such as an albumin, to an animal system (such as rabbit or mouse) or, if
it is long enough (at least about 25 amino acids), without a carrier. However,
immunogenic epitopes comprising as few as 8 to 10 amino acids have been shown to be
sufficient to raise antibodies capable of binding to, at the very least, linear epitopes in a
denatured polypeptide (e.g., in Western blotting).

As used herein, the term "antibody" (Ab) or "monoclonal antibody" (Mab) is
meant to include intact molecules as well as antibody fragments (such as, for example,
Fab and F(ab')2 fragments) which are capable of specifically binding to protein. Fab
and F(ab')2 fragments lack the Fc fragment of intact antibody, clear more rapidly from
the circulation, and may have less non-specific tissue binding than an intact antibody.
(Wahl et al., J. Nucl. Med. 24:316-325 (1983).) Thus, these fragments are preferred,
as well as the products of a FAB or other immunoglobulin expression library.
Moreover, antibodies of the present invention include chimeric, single chain, and
humanized antibodies.

Fusion Proteins 

Any polypeptide of the present invention can be used to generate fusion
proteins. For example, the polypeptide of the present invention, when fused to a
second protein, can be used as an antigenic tag. Antibodies raised against the

carboxy terminus, or both. For example, any number of amino acids, ranging from 1-60,
can be deleted from the amino terminus of either the secreted polypeptide or the
mature form. Similarly, any number of amino acids, ranging from 1-30, can be deleted
from the carboxy terminus of the secreted protein or mature form. Furthermore, any
combination of the above amino and carboxy terminus deletions is preferred.
Similarly, polynucleotide fragments encoding these polypeptide fragments are also
preferred.

Particularly, N-terminal deletions of the polypeptide of the present invention can
be described by the general formula m-p, where p is the total number of amino acids in
the polypeptide and m is an integer from 2 to (p-1), and where both of these integers (m
and p) correspond to the position of the amino acid residue identified in SEQ ID NO:Y.

Moreover, C-terminal deletions of the polypeptide of the present invention can
also be described by the general formula 1-n, where n is an integer from 2 to (p-1), and
again where these integers (n and p) correspond to the position of the amino acid residue
identified in SEQ ID NO:Y.

The invention also provides polypeptides having one or more amino acids
deleted from both the amino and the carboxyl termini, which may be described
generally as having residues m-n of SEQ ID NO:Y, where m and n are integers as
described above.
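
The m-p, 1-n, and m-n notation above can be illustrated with a minimal Python sketch. The function name `deletion_fragment` and the ten-residue example sequence are hypothetical and are not taken from the specification; the sketch only assumes that positions are 1-based and inclusive, as the residue numbering of SEQ ID NO:Y implies.

```python
from typing import Optional


def deletion_fragment(sequence: str, m: int = 1, n: Optional[int] = None) -> str:
    """Return residues m..n (1-based, inclusive) of a polypeptide sequence.

    m > 1 corresponds to an N-terminal deletion (formula m-p),
    n < p corresponds to a C-terminal deletion (formula 1-n),
    and both together give the general m-n fragment.
    """
    p = len(sequence)          # p is the total number of amino acids
    if n is None:
        n = p                  # no C-terminal deletion requested
    if not (1 <= m <= n <= p):
        raise ValueError("require 1 <= m <= n <= p")
    return sequence[m - 1:n]   # convert 1-based inclusive to a Python slice


# Hypothetical 10-residue sequence, used only for illustration.
example = "MKTLLVAGSA"
n_terminal_deletion = deletion_fragment(example, m=3)        # residues 3-p
c_terminal_deletion = deletion_fragment(example, n=8)        # residues 1-8
double_deletion = deletion_fragment(example, m=3, n=8)       # residues 3-8
```

The same slice logic would apply to any sequence length; only the bounds check depends on p.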

Also preferred are polypeptide and polynucleotide fragments characterized by
structural or functional domains, such as fragments that comprise alpha-helix and
alpha-helix forming regions, beta-sheet and beta-sheet-forming regions, turn and
turn-forming regions, coil and coil-forming regions, hydrophilic regions, hydrophobic
regions, alpha amphipathic regions, beta amphipathic regions, flexible regions,
surface-forming regions, substrate binding region, and high antigenic index regions.
Polypeptide fragments of SEQ ID NO:Y falling within conserved domains are
specifically contemplated by the present invention. Moreover, polynucleotide
fragments encoding these domains are also contemplated.

Other preferred fragments are biologically active fragments. Biologically active
fragments are those exhibiting activity similar, but not necessarily identical, to an
activity of the polypeptide of the present invention. The biological activity of the
fragments may include an improved desired activity, or a decreased undesirable activity.

Epitopes & Antibodies 

In the present invention, "epitopes" refer to polypeptide fragments having
antigenic or immunogenic activity in an animal, especially in a human. A preferred
embodiment of the present invention relates to a polypeptide fragment comprising an

Polynucleotide and Polypeptide Fragments

In the present invention, a "polynucleotide fragment" refers to a short
polynucleotide having a nucleic acid sequence contained in the deposited clone or
shown in SEQ ID NO:X. The short nucleotide fragments are preferably at least about
15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt,
and even more preferably, at least about 40 nt in length. A fragment "at least 20 nt in
length," for example, is intended to include 20 or more contiguous bases from the
cDNA sequence contained in the deposited clone or the nucleotide sequence shown in
SEQ ID NO:X. These nucleotide fragments are useful as diagnostic probes and primers
as discussed herein. Of course, larger fragments (e.g., 50, 150, 500, 600, 2000
nucleotides) are preferred.

Moreover, representative examples of polynucleotide fragments of the
invention include, for example, fragments having a sequence from about nucleotide
number 1-50, 51-100, 101-150, 151-200, 201-250, 251-300, 301-350, 351-400,
401-450, 451-500, 501-550, 551-600, 651-700, 701-750, 751-800, 800-850, 851-900,
901-950, 951-1000, 1001-1050, 1051-1100, 1101-1150, 1151-1200, 1201-1250,
1251-1300, 1301-1350, 1351-1400, 1401-1450, 1451-1500, 1501-1550, 1551-1600,
1601-1650, 1651-1700, 1701-1750, 1751-1800, 1801-1850, 1851-1900, 1901-1950,
1951-2000, or 2001 to the end of SEQ ID NO:X or the cDNA contained in the
deposited clone. In this context "about" includes the particularly recited ranges, larger
or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini.
Preferably, these fragments encode a polypeptide which has biological activity. More
preferably, these polynucleotides can be used as probes or primers as discussed herein.
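
As a rough illustration of how the recited 50-nucleotide ranges and the "about" tolerance could be expressed, the following Python sketch generates consecutive windows and checks whether an observed fragment falls within the tolerance of a recited range. The helper names `fragment_windows` and `within_about` are hypothetical, and the sketch assumes perfectly regular, non-overlapping windows, which is a simplification of the list as written.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # 1-based inclusive (start, end) nucleotide positions


def fragment_windows(length: int, size: int = 50) -> List[Range]:
    """Split positions 1..length into consecutive windows of `size` nucleotides.

    The final window simply runs to the end of the sequence.
    """
    windows = []
    start = 1
    while start <= length:
        end = min(start + size - 1, length)
        windows.append((start, end))
        start = end + 1
    return windows


def within_about(observed: Range, recited: Range, tolerance: int = 5) -> bool:
    """True if each terminus of `observed` is within `tolerance` nucleotides
    of the matching terminus of `recited` (the text's "about": 1 to 5 nt)."""
    return (abs(observed[0] - recited[0]) <= tolerance
            and abs(observed[1] - recited[1]) <= tolerance)


# Hypothetical 120-nt sequence length, for illustration only.
windows = fragment_windows(120)
```

A tolerance of 5 is used because the text lists 5 as the largest of the "several" nucleotides by which a range may be larger or smaller.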

In the present invention, a "polypeptide fragment" refers to a short amino acid
sequence contained in SEQ ID NO:Y or encoded by the cDNA contained in the
deposited clone. Protein fragments may be "free-standing," or comprised within a
larger polypeptide of which the fragment forms a part or region, most preferably as a
single continuous region. Representative examples of polypeptide fragments of the
invention include, for example, fragments from about amino acid number 1-20, 21-40,
41-60, 61-80, 81-100, 102-120, 121-140, 141-160, or 161 to the end of the coding
region. Moreover, polypeptide fragments can be about 20, 30, 40, 50, 60, 70, 80, 90,
100, 110, 120, 130, 140, or 150 amino acids in length. In this context "about"
includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1)
amino acids, at either extreme or at both extremes.

Preferred polypeptide fragments include the secreted protein as well as the
mature form. Further preferred polypeptide fragments include the secreted protein or
the mature form having a continuous series of deleted residues from the amino or the



WO 98/54963

PCT/US98/11422

The second strategy uses genetic engineering to introduce amino acid changes at
specific positions of a cloned gene to identify regions critical for protein function. For
example, site directed mutagenesis or alanine-scanning mutagenesis (introduction of
single alanine mutations at every residue in the molecule) can be used. (Cunningham
and Wells, Science 244:1081-1085 (1989).) The resulting mutant molecules can then
be tested for biological activity.

As the authors state, these two strategies have revealed that proteins are
surprisingly tolerant of amino acid substitutions. The authors further indicate which
amino acid changes are likely to be permissive at certain amino acid positions in the
protein. For example, most buried (within the tertiary structure of the protein) amino
acid residues require nonpolar side chains, whereas few features of surface side chains
are generally conserved. Moreover, tolerated conservative amino acid substitutions
involve replacement of the aliphatic or hydrophobic amino acids Ala, Val, Leu and Ile;
replacement of the hydroxyl residues Ser and Thr; replacement of the acidic residues
Asp and Glu; replacement of the amide residues Asn and Gln; replacement of the basic
residues Lys, Arg, and His; replacement of the aromatic residues Phe, Tyr, and Trp;
and replacement of the small-sized amino acids Ala, Ser, Thr, Met, and Gly.
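
The conservative substitution groups listed above can be captured in a small lookup table. The following Python sketch is illustrative only: the dictionary name, the group labels, and the helper `is_conservative` are hypothetical, and the sketch treats a substitution as conservative simply when both residues appear together in at least one of the listed groups.

```python
# Residue groups transcribed from the paragraph above (three-letter codes).
CONSERVATIVE_GROUPS = {
    "aliphatic or hydrophobic": {"Ala", "Val", "Leu", "Ile"},
    "hydroxyl": {"Ser", "Thr"},
    "acidic": {"Asp", "Glu"},
    "amide": {"Asn", "Gln"},
    "basic": {"Lys", "Arg", "His"},
    "aromatic": {"Phe", "Tyr", "Trp"},
    "small": {"Ala", "Ser", "Thr", "Met", "Gly"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues share at least one listed group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS.values())


# Example checks on individual substitutions.
leu_to_ile = is_conservative("Leu", "Ile")   # both aliphatic/hydrophobic
asp_to_lys = is_conservative("Asp", "Lys")   # acidic vs. basic
```

Because Ala, Ser, and Thr appear in two groups each, membership is tested per group rather than by assigning each residue a single class.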

Besides conservative amino acid substitution, variants of the present invention
include (i) substitutions with one or more of the non-conserved amino acid residues,
where the substituted amino acid residues may or may not be one encoded by the
genetic code, or (ii) substitution with one or more of amino acid residues having a
substituent group, or (iii) fusion of the mature polypeptide with another compound,
such as a compound to increase the stability and/or solubility of the polypeptide (for
example, polyethylene glycol), or (iv) fusion of the polypeptide with additional amino
acids, such as an IgG Fc fusion region peptide, or leader or secretory sequence, or a
sequence facilitating purification. Such variant polypeptides are deemed to be within
the scope of those skilled in the art from the teachings herein.

For example, polypeptide variants containing amino acid substitutions of
charged amino acids with other charged or neutral amino acids may produce proteins
with improved characteristics, such as less aggregation. Aggregation of pharmaceutical
formulations both reduces activity and increases clearance due to the aggregate's
immunogenic activity. (Pinckard et al., Clin. Exp. Immunol. 2:331-340 (1967);
Robbins et al., Diabetes 36:838-845 (1987); Cleland et al., Crit. Rev. Therapeutic
Drug Carrier Systems 10:307-377 (1993).)

(e.g., Filtron), equilibrated with 40 mM sodium acetate, pH 6.0 is employed. The
filtered sample is loaded onto a cation exchange resin (e.g., Poros HS-50, Perseptive
Biosystems). The column is washed with 40 mM sodium acetate, pH 6.0 and eluted
with 250 mM, 500 mM, 1000 mM, and 1500 mM NaCl in the same buffer, in a
stepwise manner. The absorbance at 280 nm of the effluent is continuously monitored.
Fractions are collected and further analyzed by SDS-PAGE.

Fractions containing the polypeptide are then pooled and mixed with 4 volumes
of water. The diluted sample is then loaded onto a previously prepared set of tandem
columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak anion
(Poros CM-20, Perseptive Biosystems) exchange resins. The columns are equilibrated
with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium
acetate, pH 6.0, 200 mM NaCl. The CM-20 column is then eluted using a 10 column
volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0
M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant A280
monitoring of the effluent. Fractions containing the polypeptide (determined, for
instance, by 16% SDS-PAGE) are then pooled.
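
The arithmetic behind the dilution and the linear gradient described above can be sketched as follows. The function names are hypothetical, and the sketch assumes an ideal linear gradient and ideal mixing; actual salt concentrations at the column outlet will lag because of system dead volume.

```python
def diluted_concentration(conc_mm: float, volumes_added: float = 4.0) -> float:
    """Concentration after mixing one volume of sample with `volumes_added`
    volumes of water (4 volumes gives a 5-fold dilution)."""
    return conc_mm / (1.0 + volumes_added)


def gradient_nacl_molar(cv_elapsed: float,
                        start_m: float = 0.2,
                        end_m: float = 1.0,
                        total_cv: float = 10.0) -> float:
    """NaCl concentration (M) at a given point in a linear gradient.

    Defaults follow the text: 0.2 M to 1.0 M NaCl over 10 column volumes.
    """
    if not 0.0 <= cv_elapsed <= total_cv:
        raise ValueError("cv_elapsed must lie within the gradient")
    return start_m + (end_m - start_m) * cv_elapsed / total_cv


# A fraction eluted at 500 mM NaCl and mixed with 4 volumes of water:
pooled_after_dilution = diluted_concentration(500.0)
# Nominal NaCl concentration halfway through the gradient:
midpoint = gradient_nacl_molar(5.0)
```

The 5-fold dilution brings a 500 mM fraction below the 200 mM wash concentration, which is consistent with the sample being loaded before the salt wash.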

The resultant polypeptide should exhibit greater than 95% purity after the above
refolding and purification steps. No major contaminant bands should be observed from
Coomassie blue stained 16% SDS-PAGE gel when 5 µg of purified protein is loaded.
The purified protein can also be tested for endotoxin/LPS contamination, and typically
the LPS content is less than 0.1 ng/ml according to LAL assays.

Example 7: Cloning and Expression of a Polypeptide in a Baculovirus 
Expression System

In this example, the plasmid shuttle vector pA2 is used to insert a polynucleotide
into a baculovirus to express a polypeptide. This expression vector contains the strong
polyhedrin promoter of the Autographa californica nuclear polyhedrosis virus
(AcMNPV) followed by convenient restriction sites such as BamHI, Xba I and
Asp718. The polyadenylation site of the simian virus 40 ("SV40") is used for efficient
polyadenylation. For easy selection of recombinant virus, the plasmid contains the
beta-galactosidase gene from E. coli under control of a weak Drosophila promoter in the
same orientation, followed by the polyadenylation signal of the polyhedrin gene. The
inserted genes are flanked on both sides by viral sequences for cell-mediated
homologous recombination with wild-type viral DNA to generate a viable virus that
expresses the cloned polynucleotide.

Many other baculovirus vectors can be used in place of the vector above, such
as pAc373, pVL941, and pAcIM1, as one skilled in the art would readily appreciate, as
long as the construct provides appropriately located signals for transcription,
translation, secretion and the like, including a signal peptide and an in-frame AUG as
required. Such vectors are described, for instance, in Luckow et al., Virology
170:31-39 (1989).

Specifically, the cDNA sequence contained in the deposited clone, including the
AUG initiation codon and the naturally associated leader sequence identified in Table 1,
is amplified using the PCR protocol described in Example 1. If the naturally occurring
signal sequence is used to produce the secreted protein, the pA2 vector does not need a
second signal peptide. Alternatively, the vector can be modified (pA2 GP) to include a
baculovirus leader sequence, using the standard methods described in Summers et al.,
"A Manual of Methods for Baculovirus Vectors and Insect Cell Culture Procedures,"
Texas Agricultural Experimental Station Bulletin No. 1555 (1987).

The amplified fragment is isolated from a 1% agarose gel using a commercially
available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment then is digested
with appropriate restriction enzymes and again purified on a 1% agarose gel.

The plasmid is digested with the corresponding restriction enzymes and
optionally, can be dephosphorylated using calf intestinal phosphatase, using routine
procedures known in the art. The DNA is then isolated from a 1% agarose gel using a
commercially available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.).

The fragment and the dephosphorylated plasmid are ligated together with T4
DNA ligase. E. coli HB101 or other suitable E. coli hosts such as XL-1 Blue
(Stratagene Cloning Systems, La Jolla, CA) cells are transformed with the ligation
mixture and spread on culture plates. Bacteria containing the plasmid are identified by
digesting DNA from individual colonies and analyzing the digestion product by gel
electrophoresis. The sequence of the cloned fragment is confirmed by DNA
sequencing.

Five µg of a plasmid containing the polynucleotide is co-transfected with 1.0 µg
of a commercially available linearized baculovirus DNA ("BaculoGold™ baculovirus
DNA", Pharmingen, San Diego, CA), using the lipofection method described by
Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987). One µg of
BaculoGold™ virus DNA and 5 µg of the plasmid are mixed in a sterile well of a
microtiter plate containing 50 µl of serum-free Grace's medium (Life Technologies
Inc., Gaithersburg, MD). Afterwards, 10 µl Lipofectin plus 90 µl Grace's medium are
added, mixed and incubated for 15 minutes at room temperature. Then the transfection
mixture is added drop-wise to Sf9 insect cells (ATCC CRL 1711) seeded in a 35 mm
tissue culture plate with 1 ml Grace's medium without serum. The plate is then
incubated for 5 hours at 27° C. The transfection solution is then removed from the plate
and 1 ml of Grace's insect medium supplemented with 10% fetal calf serum is added.
Cultivation is then continued at 27° C for four days.

After four days the supernatant is collected and a plaque assay is performed, as
described by Summers and Smith, supra. An agarose gel with "Blue Gal" (Life
Technologies Inc., Gaithersburg) is used to allow easy identification and isolation of
gal-expressing clones, which produce blue-stained plaques. (A detailed description of a
"plaque assay" of this type can also be found in the user's guide for insect cell culture
and baculovirology distributed by Life Technologies Inc., Gaithersburg, page 9-10.)
After appropriate incubation, blue stained plaques are picked with the tip of a
micropipettor (e.g., Eppendorf). The agar containing the recombinant viruses is then
resuspended in a microcentrifuge tube containing 200 µl of Grace's medium and the
suspension containing the recombinant baculovirus is used to infect Sf9 cells seeded in
35 mm dishes. Four days later the supernatants of these culture dishes are harvested
and then they are stored at 4° C.

To verify the expression of the polypeptide, Sf9 cells are grown in Grace's
medium supplemented with 10% heat-inactivated FBS. The cells are infected with the
recombinant baculovirus containing the polynucleotide at a multiplicity of infection
("MOI") of about 2. If radiolabeled proteins are desired, 6 hours later the medium is
removed and is replaced with SF900 II medium minus methionine and cysteine
(available from Life Technologies Inc., Rockville, MD). After 42 hours, 5 µCi of
35S-methionine and 5 µCi 35S-cysteine (available from Amersham) are added. The cells are
further incubated for 16 hours and then are harvested by centrifugation. The proteins
in the supernatant as well as the intracellular proteins are analyzed by SDS-PAGE
followed by autoradiography (if radiolabeled).
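
For the infection step, the volume of viral stock needed to reach a given MOI follows from the definition of MOI as plaque-forming units per cell. The Python sketch below shows that calculation; the function name, the cell count, and the viral titer are hypothetical values chosen for illustration, since the text specifies only the MOI of about 2.

```python
def inoculum_volume_ml(cell_count: float, moi: float,
                       titer_pfu_per_ml: float) -> float:
    """Volume of virus stock (ml) that delivers `moi` pfu per cell.

    MOI = (titer * volume) / cell_count, so volume = MOI * cells / titer.
    """
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return moi * cell_count / titer_pfu_per_ml


# Assumed values, not from the text: 1e6 Sf9 cells and a stock at 1e8 pfu/ml.
volume_ml = inoculum_volume_ml(cell_count=1e6, moi=2, titer_pfu_per_ml=1e8)
volume_ul = volume_ml * 1000.0
```

In practice the titer would come from the plaque assay performed on the harvested supernatant.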

Microsequencing of the amino acid sequence of the amino terminus of purified 
protein may be used to determine the amino terminal sequence of the produced 
protein. 






Example 8: Expression of a Polypeptide in Mammalian Cells

The polypeptide of the present invention can be expressed in a mammalian cell.
A typical mammalian expression vector contains a promoter element, which mediates
the initiation of transcription of mRNA, a protein coding sequence, and signals required
for the termination of transcription and polyadenylation of the transcript. Additional
elements include enhancers, Kozak sequences and intervening sequences flanked by
donor and acceptor sites for RNA splicing. Highly efficient transcription is achieved
with the early and late promoters from SV40, the long terminal repeats (LTRs) from
Retroviruses, e.g., RSV, HTLVI, HIVI and the early promoter of the cytomegalovirus
(CMV). However, cellular elements can also be used (e.g., the human actin promoter).

Suitable expression vectors for use in practicing the present invention include, 
for example, vectors such as pSVL and pMSG (Pharmacia, Uppsala, Sweden), 
pRSVcat (ATCC 37152), pSV2dhfr (ATCC 37146), pBC12MI (ATCC 67109), 
pCMVSport 2.0, and pCMVSport 3.0. Mammalian host cells that could be used
include human Hela, 293, H9 and Jurkat cells, mouse NIH3T3 and C127 cells, Cos 1,
Cos 7 and CV 1, quail QC1-3 cells, mouse L cells and Chinese hamster ovary (CHO)
cells. 

Alternatively, the polypeptide can be expressed in stable cell lines containing the
polynucleotide integrated into a chromosome. Co-transfection with a selectable
marker such as dhfr, gpt, neomycin, or hygromycin allows the identification and isolation
of the transfected cells.

The transfected gene can also be amplified to express large amounts of the
encoded protein. The DHFR (dihydrofolate reductase) marker is useful in developing
cell lines that carry several hundred or even several thousand copies of the gene of
interest. (See, e.g., Alt, F. W., et al., J. Biol. Chem. 253:1357-1370 (1978); Hamlin,
J. L. and Ma, C., Biochem. et Biophys. Acta, 1097:107-143 (1990); Page, M. J. and
Sydenham, M. A., Biotechnology 9:64-68 (1991).) Another useful selection marker is
the enzyme glutamine synthase (GS) (Murphy et al., Biochem J. 227:277-279 (1991);
Bebbington et al., Bio/Technology 10:169-175 (1992)). Using these markers, the
mammalian cells are grown in selective medium and the cells with the highest resistance
are selected. These cell lines contain the amplified gene(s) integrated into a
chromosome. Chinese hamster ovary (CHO) and NSO cells are often used for the
production of proteins.

Derivatives of the plasmid pSV2-dhfr (ATCC Accession No. 37146), the
expression vectors pC4 (ATCC Accession No. 209646) and pC6 (ATCC Accession
No. 209647) contain the strong promoter (LTR) of the Rous Sarcoma Virus (Cullen et
al., Molecular and Cellular Biology, 438-447 (March, 1985)) plus a fragment of the
CMV-enhancer (Boshart et al., Cell 41:521-530 (1985).) Multiple cloning sites, e.g.,
with the restriction enzyme cleavage sites BamHI, XbaI and Asp718, facilitate the
cloning of the gene of interest. The vectors also contain the 3' intron, the
polyadenylation and termination signal of the rat preproinsulin gene, and the mouse
DHFR gene under control of the SV40 early promoter.

Specifically, the plasmid pC6, for example, is digested with appropriate 
restriction enzymes and then dephosphorylated using calf intestinal phosphatase by
procedures known in the art. The vector is then isolated from a 1% agarose gel. 

A polynucleotide of the present invention is amplified according to the protocol 
outlined in Example 1. If the naturally occurring signal sequence is used to produce the
secreted protein, the vector does not need a second signal peptide. Alternatively, if the
naturally occurring signal sequence is not used, the vector can be modified to include a
heterologous signal sequence. (See, e.g., WO 96/34891.) 

The amplified fragment is isolated from a 1% agarose gel using a commercially 
available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment then is digested
with appropriate restriction enzymes and again purified on a 1% agarose gel. 

The amplified fragment is then digested with the same restriction enzyme and
purified on a 1% agarose gel. The isolated fragment and the dephosphorylated vector
are then ligated with T4 DNA ligase. E. coli HB101 or XL-1 Blue cells are then
transformed and bacteria are identified that contain the fragment inserted into plasmid
pC6 using, for instance, restriction enzyme analysis.

Chinese hamster ovary cells lacking an active DHFR gene are used for
transfection. Five µg of the expression plasmid pC6 is cotransfected with 0.5 µg of the
plasmid pSVneo using lipofectin (Felgner et al., supra). The plasmid pSV2-neo
contains a dominant selectable marker, the neo gene from Tn5 encoding an enzyme that
confers resistance to a group of antibiotics including G418. The cells are seeded in
alpha minus MEM supplemented with 1 mg/ml G418. After 2 days, the cells are
trypsinized and seeded in hybridoma cloning plates (Greiner, Germany) in alpha minus
MEM supplemented with 10, 25, or 50 ng/ml of methotrexate plus 1 mg/ml G418.
After about 10-14 days single clones are trypsinized and then seeded in 6-well petri
dishes or 10 ml flasks using different concentrations of methotrexate (50 nM, 100 nM,
200 nM, 400 nM, 800 nM). Clones growing at the highest concentrations of
methotrexate are then transferred to new 6-well plates containing even higher
concentrations of methotrexate (1 µM, 2 µM, 5 µM, 10 µM, 20 µM). The same
procedure is repeated until clones are obtained which grow at a concentration of 100-200
µM. Expression of the desired gene product is analyzed, for instance, by SDS-PAGE
and Western blot or by reversed phase HPLC analysis.
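
The stepwise methotrexate selection described above amounts to walking clones up a ladder of increasing concentrations. The Python sketch below encodes that ladder in nanomolar units and returns the next step for a surviving clone. The names `MTX_LADDER_NM` and `next_concentration_nm` are hypothetical, and the sketch assumes the two highest listed values are 10 µM and 20 µM (micromolar), which is what the increasing series and the 100-200 µM end point suggest.

```python
from typing import Optional

# Selection steps from the text, expressed in nM (1 µM = 1000 nM).
# Assumption: the last two steps are 10 µM and 20 µM.
MTX_LADDER_NM = [50, 100, 200, 400, 800, 1000, 2000, 5000, 10000, 20000]


def next_concentration_nm(current_nm: float) -> Optional[int]:
    """Return the lowest ladder concentration above `current_nm`,
    or None when the clone already grows at the top of the ladder."""
    for step in MTX_LADDER_NM:
        if step > current_nm:
            return step
    return None


# A clone growing at 800 nM would next be challenged at 1 µM (1000 nM).
after_800 = next_concentration_nm(800)
```

Further rounds up to 100-200 µM would extend the list in the same way; those intermediate values are not given in the text and are therefore left out.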
Example 9: Protein Fusions

The polypeptides of the present invention are preferably fused to other proteins.
These fusion proteins can be used for a variety of applications. For example, fusion of
the present polypeptides to His-tag, HA-tag, protein A, IgG domains, and maltose
binding protein facilitates purification. (See Example 5; see also EP A 394,827;
Traunecker, et al., Nature 331:84-86 (1988).) Similarly, fusion to IgG-1, IgG-3, and
albumin increases the half-life in vivo. Nuclear localization signals fused to the
polypeptides of the present invention can target the protein to a specific subcellular
localization, while covalent heterodimers or homodimers can increase or decrease the
activity of a fusion protein. Fusion proteins can also create chimeric molecules having
more than one function. Finally, fusion proteins can increase solubility and/or stability
of the fused protein compared to the non-fused protein. All of the types of fusion
proteins described above can be made by modifying the following protocol, which
outlines the fusion of a polypeptide to an IgG molecule, or the protocol described in
Example 5.

Briefly, the human Fc portion of the IgG molecule can be PCR amplified, using
primers that span the 5' and 3' ends of the sequence described below. These primers
also should have convenient restriction enzyme sites that will facilitate cloning into an
expression vector, preferably a mammalian expression vector.

For example, if pC4 (Accession No. 209646) is used, the human Fc portion can
be ligated into the BamHI cloning site. Note that the 3' BamHI site should be
destroyed. Next, the vector containing the human Fc portion is re-restricted with
BamHI, linearizing the vector, and a polynucleotide of the present invention, isolated
by the PCR protocol described in Example 1, is ligated into this BamHI site. Note that
the polynucleotide is cloned without a stop codon, otherwise a fusion protein will not
be produced.
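
The requirement that the insert carry no stop codon can be checked computationally before cloning. The following Python sketch is a simplified check under stated assumptions: the function name `fusion_ready` is hypothetical, the input is taken to be the coding sequence starting at the first base of the initiation codon, and the extra bases contributed by the BamHI site and any linker are ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fusion_ready(coding_sequence: str) -> bool:
    """True if the sequence is a whole number of codons and contains no
    in-frame stop codon, so translation can continue into the Fc portion."""
    seq = coding_sequence.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False  # a partial codon would shift the downstream frame
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return not any(codon in STOP_CODONS for codon in codons)


# Hypothetical inserts, for illustration only.
without_stop = fusion_ready("ATGGCCAAA")   # Met-Ala-Lys, no stop codon
with_stop = fusion_ready("ATGGCCTAA")      # ends in TAA
```

A complete check would also confirm that the junction created at the restriction site keeps the Fc coding region in the same reading frame.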

If the naturally occurring signal sequence is used to produce the secreted 
protein, pC4 does not need a second signal peptide. Alternatively, if the naturally 
occurring signal sequence is not used, the vector can be modified to include a 
heterologous signal sequence. (See, e.g., WO 96/34891.)

Human IgG Fc region: 

GGGATCCGGAGCCCAAATCTTCTGACAAAACTCACACATGCCCACCGTGCC
CAGCACCTGAATTCGAGGGTGCACCGTCAGTCTTCCTCTTCCCCCCAAAACC
CAAGGACACCCTCATGATCTCCCGGACTCCTGAGGTCACATGCGTGGTGGT
GGACGTAAGCCACGAAGACCCTGAGGTCAAGTTCAACTGGTACGTGGACG
GCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTACAAC
AGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTG
AATGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGCCCTCCCAACCCCC
ATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAACCACAGGT
GTACACCCTGCCCCCATCCCGGGATGAGCTGACCAAGAACCAGGTCAGCCT
GACCTGCCTGGTCAAAGGCTTCTATCCAAGCGACATCGCCGTGGAGTGGGA
GAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGG
ACTCCGACGGCTCCTTCTTCCTCTACAGCAAGCTCACCGTGGACAAGAGCA
GGTGGCAGCAGGGGAACGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGC
ACAACCACTACACGCAGAAGAGCCTCTCCCTGTCTCCGGGTAAATGAGTGC
GACGGCCGCGACTCTAGAGGAT (SEQ ID NO: 1)

Example 10: Production of an Antibody from a Polypeptide 

The antibodies of the present invention can be prepared by a variety of methods.
(See, Current Protocols, Chapter 2.) For example, cells expressing a polypeptide of
the present invention are administered to an animal to induce the production of sera
containing polyclonal antibodies. In a preferred method, a preparation of the secreted
protein is prepared and purified to render it substantially free of natural contaminants.
Such a preparation is then introduced into an animal in order to produce polyclonal
antisera of greater specific activity.

In the most preferred method, the antibodies of the present invention are
monoclonal antibodies (or protein binding fragments thereof). Such monoclonal
antibodies can be prepared using hybridoma technology. (Kohler et al., Nature
256:495 (1975); Kohler et al., Eur. J. Immunol. 6:511 (1976); Kohler et al., Eur. J.
Immunol. 6:292 (1976); Hammerling et al., in: Monoclonal Antibodies and T-Cell
Hybridomas, Elsevier, N.Y., pp. 563-681 (1981).) In general, such procedures
involve immunizing an animal (preferably a mouse) with polypeptide or, more
preferably, with a secreted polypeptide-expressing cell. Such cells may be cultured in
any suitable tissue culture medium; however, it is preferable to culture cells in Earle's
modified Eagle's medium supplemented with 10% fetal bovine serum (inactivated at
about 56° C), and supplemented with about 10 g/l of nonessential amino acids, about
1,000 U/ml of penicillin, and about 100 µg/ml of streptomycin.

The splenocytes of such mice are extracted and fused with a suitable myeloma
cell line. Any suitable myeloma cell line may be employed in accordance with the
present invention; however, it is preferable to employ the parent myeloma cell line
(SP20), available from the ATCC. After fusion, the resulting hybridoma cells are
selectively maintained in HAT medium, and then cloned by limiting dilution as
described by Wands et al. (Gastroenterology 80:225-232 (1981).) The hybridoma cells
obtained through such a selection are then assayed to identify clones which secrete
antibodies capable of binding the polypeptide.

Alternatively, additional antibodies capable of binding to the polypeptide can be
produced in a two-step procedure using anti-idiotypic antibodies. Such a method
makes use of the fact that antibodies are themselves antigens, and therefore, it is
possible to obtain an antibody which binds to a second antibody. In accordance with
this method, protein specific antibodies are used to immunize an animal, preferably a
mouse. The splenocytes of such an animal are then used to produce hybridoma cells,
and the hybridoma cells are screened to identify clones which produce an antibody
whose ability to bind to the protein-specific antibody can be blocked by the polypeptide.
Such antibodies comprise anti-idiotypic antibodies to the protein-specific antibody and
can be used to immunize an animal to induce formation of further protein-specific
antibodies.

It will be appreciated that Fab and F(ab')2 and other fragments of the antibodies
of the present invention may be used according to the methods disclosed herein. Such
fragments are typically produced by proteolytic cleavage, using enzymes such as papain
(to produce Fab fragments) or pepsin (to produce F(ab')2 fragments). Alternatively,
secreted protein-binding fragments can be produced through the application of
recombinant DNA technology or through synthetic chemistry.

For in vivo use of antibodies in humans, it may be preferable to use
"humanized" chimeric monoclonal antibodies. Such antibodies can be produced using
genetic constructs derived from hybridoma cells producing the monoclonal antibodies
described above. Methods for producing chimeric antibodies are known in the art.
(See, for review, Morrison, Science 229:1202 (1985); Oi et al., BioTechniques 4:214
(1986); Cabilly et al., U.S. Patent No. 4,816,567; Taniguchi et al., EP 171496;
Morrison et al., EP 173494; Neuberger et al., WO 8601533; Robinson et al., WO
8702671; Boulianne et al., Nature 312:643 (1984); Neuberger et al., Nature 314:268
(1985).)


Example 11: Production Of Secreted Protein For High-Throughput
Screening Assays

The following protocol produces a supernatant containing a polypeptide to be
tested. This supernatant can then be used in the Screening Assays described in
Examples 13-20.

First, dilute Poly-D-Lysine (644 587 Boehringer-Mannheim) stock solution
(1 mg/ml in PBS) 1:20 in PBS (w/o calcium or magnesium, 17-516F Biowhittaker) for a
working solution of 50 ug/ml. Add 200 ul of this solution to each well (24 well plates)
and incubate at RT for 20 minutes. Be sure to distribute the solution over each well
(note: a 12-channel pipetter may be used with tips on every other channel). Aspirate off
the Poly-D-Lysine solution and rinse with 1 ml PBS (Phosphate Buffered Saline). The
PBS should remain in the well until just prior to plating the cells, and plates may be
poly-lysine coated in advance for up to two weeks.
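
The coating step is simple dilution arithmetic, sketched below. The function names are illustrative only, and the choice of four 24-well plates is an assumption taken from the later transfection step, not a requirement stated here.

```python
def diluted_concentration(stock, fold):
    """Concentration after a 1:fold dilution, in the same units as `stock`."""
    return stock / fold


def coating_volume_ml(n_plates, wells_per_plate=24, ul_per_well=200):
    """Total working solution (ml) needed to coat every well once."""
    return n_plates * wells_per_plate * ul_per_well / 1000


stock_ug_per_ml = 1000  # the 1 mg/ml stock, expressed in ug/ml
working = diluted_concentration(stock_ug_per_ml, 20)  # 1:20 dilution in PBS
print(f"working solution: {working} ug/ml")
print(f"volume for 4 plates: {coating_volume_ml(4)} ml")
```

The sketch ignores pipetting dead volume, so a practical preparation would include some excess.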

Plate 293T cells (do not carry cells past P+20) at 2 x 10^5 cells/well in .5ml
DMEM (Dulbecco's Modified Eagle Medium) (with 4.5 G/L glucose and L-glutamine
(12-604F Biowhittaker))/10% heat inactivated FBS (14-503F Biowhittaker)/1x
Penstrep (17-602E Biowhittaker). Let the cells grow overnight.

The next day, mix together in a sterile solution basin: 300 ul Lipofectamine
(18324-012 Gibco/BRL) and 5ml Optimem I (31985070 Gibco/BRL)/96-well plate.
With a small volume multi-channel pipetter, aliquot approximately 2ug of an expression
vector containing a polynucleotide insert, produced by the methods described in
Examples 8 or 9, into an appropriately labeled 96-well round bottom plate. With a
multi-channel pipetter, add 50ul of the Lipofectamine/Optimem I mixture to each well.
Pipette up and down gently to mix. Incubate at RT 15-45 minutes. After about 20
minutes, use a multi-channel pipetter to add 150ul Optimem I to each well. As a
control, one plate of vector DNA lacking an insert should be transfected with each set of
transfections.
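
The volumes in this step can be checked with a short calculation. This is a sketch under one stated assumption: the volume of the roughly 2 ug DNA aliquot is not given in the text and is treated as negligible. All names are illustrative.

```python
LIPOFECTAMINE_UL = 300   # Lipofectamine per 96-well plate of DNA
OPTIMEM_UL = 5000        # 5 ml Optimem I per 96-well plate
MIX_PER_WELL_UL = 50     # Lipofectamine/Optimem mixture added to each well
OPTIMEM_TOPUP_UL = 150   # Optimem I added after about 20 minutes
WELLS = 96


def mix_budget():
    """Return (mixture prepared, mixture dispensed, excess), all in ul."""
    prepared = LIPOFECTAMINE_UL + OPTIMEM_UL
    dispensed = MIX_PER_WELL_UL * WELLS
    return prepared, dispensed, prepared - dispensed


def lipofectamine_per_well_ul():
    """Lipofectamine delivered to each well by the 50 ul aliquot."""
    prepared = LIPOFECTAMINE_UL + OPTIMEM_UL
    return MIX_PER_WELL_UL * LIPOFECTAMINE_UL / prepared


def complex_volume_per_well_ul():
    """Final complex volume per well, ignoring the DNA aliquot volume."""
    return MIX_PER_WELL_UL + OPTIMEM_TOPUP_UL


print(mix_budget())
print(round(lipofectamine_per_well_ul(), 2))
print(complex_volume_per_well_ul())
```

The 200 ul total per well agrees with the 200 ul of DNA/Lipofectamine/Optimem I complex transferred to the cells in the next step.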

Preferably, the transfection should be performed by tag-teaming the following
tasks. By tag-teaming, hands on time is cut in half, and the cells do not spend too
much time on PBS. First, person A aspirates off the media from four 24-well plates of
cells, and then person B rinses each well with .5-1 ml PBS. Person A then aspirates off
PBS rinse, and person B, using a 12-channel pipetter with tips on every other channel,
adds the 200ul of DNA/Lipofectamine/Optimem I complex to the odd wells first, then to
the even wells, to each row on the 24-well plates. Incubate at 37°C for 6 hours.

While cells are incubating, prepare appropriate media, either 1% BSA in DMEM
with 1x penstrep, or CHO-5 media (116.6 mg/L of CaCl2 (anhyd); 0.00130 mg/L
CuSO4-5H2O; 0.050 mg/L of Fe(NO3)3-9H2O; 0.417 mg/L of FeSO4-7H2O; 311.80
mg/L of KCl; 28.64 mg/L of MgCl2; 48.84 mg/L of MgSO4; 6995.50 mg/L of NaCl;
2400.0 mg/L of NaHCO3; 62.50 mg/L of NaH2PO4-H2O; 71.02 mg/L of Na2HPO4;
.4320 mg/L of ZnSO4-7H2O; .002 mg/L of Arachidonic Acid; 1.022 mg/L of
Cholesterol; .070 mg/L of DL-alpha-Tocopherol-Acetate; 0.0520 mg/L of Linoleic
Acid; 0.010 mg/L of Linolenic Acid; 0.010 mg/L of Myristic Acid; 0.010 mg/L of Oleic
Acid; 0.010 mg/L of Palmitric Acid; 0.010 mg/L of Palmitic Acid; 100 mg/L of
Pluronic F-68; 0.010 mg/L of Stearic Acid; 2.20 mg/L of Tween 80; 4551 mg/L of
D-Glucose; 130.85 mg/ml of L-Alanine; 147.50 mg/ml of L-Arginine-HCL; 7.50 mg/ml
of L-Asparagine-H2O; 6.65 mg/ml of L-Aspartic Acid; 29.56 mg/ml of
L-Cystine-2HCL-H2O; 31.29 mg/ml of L-Cystine-2HCL; 7.35 mg/ml of L-Glutamic Acid;
365.0 mg/ml of L-Glutamine; 18.75 mg/ml of Glycine; 52.48 mg/ml of
L-Histidine-HCL-H2O; 106.97 mg/ml of L-Isoleucine; 111.45 mg/ml of L-Leucine;
163.75 mg/ml of L-Lysine HCL; 32.34 mg/ml of L-Methionine; 68.48 mg/ml of
L-Phenylalanine; 40.0 mg/ml of L-Proline; 26.25 mg/ml of L-Serine; 101.05 mg/ml of
L-Threonine; 19.22 mg/ml of L-Tryptophan; 91.79 mg/ml of L-Tyrosine-2Na-2H2O;
99.65 mg/ml of L-Valine; 0.0035 mg/L of Biotin; 3.24 mg/L of D-Ca Pantothenate;
11.78 mg/L of Choline Chloride; 4.65 mg/L of Folic Acid; 15.60 mg/L of i-Inositol;
3.02 mg/L of Niacinamide; 3.00 mg/L of Pyridoxal HCL; 0.031 mg/L of Pyridoxine
HCL; 0.319 mg/L of Riboflavin; 3.17 mg/L of Thiamine HCL; 0.365 mg/L of
Thymidine; and 0.680 mg/L of Vitamin B12; 25 mM of HEPES Buffer; 2.39 mg/L of Na
Hypoxanthine; 0.105 mg/L of Lipoic Acid; 0.081 mg/L of Sodium Putrescine-2HCL;
55.0 mg/L of Sodium Pyruvate; 0.0067 mg/L of Sodium Selenite; 20uM of
Ethanolamine; 0.122 mg/L of Ferric Citrate; 41.70 mg/L of Methyl-B-Cyclodextrin
complexed with Linoleic Acid; 33.33 mg/L of Methyl-B-Cyclodextrin complexed with
Oleic Acid; and 10 mg/L of Methyl-B-Cyclodextrin complexed with Retinal) with 2mM
glutamine and 1x penstrep. (BSA (81-068-3 Bayer) 100gm dissolved in 1L DMEM for a
10% BSA stock solution). Filter the media and collect 50 ul for endotoxin assay in
15ml polystyrene conical.

The transfection reaction is terminated, preferably by tag-teaming, at the end of
the incubation period. Person A aspirates off the transfection media, while person B
adds 1.5 ml appropriate media to each well. Incubate at 37°C for 45 or 72 hours
depending on the media used: 1% BSA for 45 hours or CHO-5 for 72 hours.

On day four, using a 300ul multichannel pipetter, aliquot 600ul in one 1 ml deep
well plate and the remaining supernatant into a 2ml deep well. The supernatants from
each well can then be used in the assays described in Examples 13-20.

It is specifically understood that when activity is obtained in any of the assays
described below using a supernatant, the activity originates either from the polypeptide
directly (e.g., as a secreted protein) or from the polypeptide inducing expression of other
proteins, which are then secreted into the supernatant. Thus, the invention further
provides a method of identifying the protein in the supernatant characterized by an
activity in a particular assay.

Example 12: Construction of GAS Reporter Construct

One signal transduction pathway involved in the differentiation and proliferation
of cells is called the Jaks-STATs pathway. Activated proteins in the Jaks-STATs
pathway bind to gamma activation site "GAS" elements or interferon-sensitive
responsive element ("ISRE"), located in the promoter of many genes. The binding of a
protein to these elements alters the expression of the associated gene.

GAS and ISRE elements are recognized by a class of transcription factors called
Signal Transducers and Activators of Transcription, or "STATs." There are six
members of the STATs family. Stat1 and Stat3 are present in many cell types, as is
Stat2 (as response to IFN-alpha is widespread). Stat4 is more restricted and is not in
many cell types though it has been found in T helper class I cells after treatment with
IL-12. Stat5 was originally called mammary growth factor, but has been found at
higher concentrations in other cells including myeloid cells. It can be activated in tissue
culture cells by many cytokines.

The STATs are activated to translocate from the cytoplasm to the nucleus upon
tyrosine phosphorylation by a set of kinases known as the Janus Kinase ("Jaks")
family. Jaks represent a distinct family of soluble tyrosine kinases and include Tyk2,
Jak1, Jak2, and Jak3. These kinases display significant sequence similarity and are
generally catalytically inactive in resting cells.

The Jaks are activated by a wide range of receptors summarized in the Table
below. (Adapted from review by Schindler and Darnell, Ann. Rev. Biochem. 64:621-51
(1995).) A cytokine receptor family, capable of activating Jaks, is divided into two
groups: (a) Class 1 includes receptors for IL-2, IL-3, IL-4, IL-6, IL-7, IL-9, IL-11,
IL-12, IL-15, Epo, PRL, GH, G-CSF, GM-CSF, LIF, CNTF, and thrombopoietin; and
(b) Class 2 includes IFN-a, IFN-g, and IL-10. The Class 1 receptors share a
conserved cysteine motif (a set of four conserved cysteines and one tryptophan) and a
WSXWS motif (a membrane proximal region encoding Trp-Ser-Xxx-Trp-Ser (SEQ ID
NO:2)).

Thus, on binding of a ligand to a receptor, Jaks are activated, which in turn 
activate STATs, which then translocate and bind to GAS elements. This entire process 
is encompassed in the Jaks-STATs signal transduction pathway. 

Therefore, activation of the Jaks-STATs pathway, reflected by the binding of
the GAS or the ISRE element, can be used to indicate proteins involved in the
proliferation and differentiation of cells. For example, growth factors and cytokines are
known to activate the Jaks-STATs pathway. (See Table below.) Thus, by using GAS
elements linked to reporter molecules, activators of the Jaks-STATs pathway can be
identified.

To construct a synthetic GAS containing promoter element, which is used in the
Biological Assays described in Examples 13-14, a PCR based strategy is employed to
generate a GAS-SV40 promoter sequence. The 5' primer contains four tandem copies
of the GAS binding site found in the IRF1 promoter and previously demonstrated to
bind STATs upon induction with a range of cytokines (Rothman et al., Immunity
1:457-468 (1994)), although other GAS or ISRE elements can be used instead. The 5'
primer also contains 18bp of sequence complementary to the SV40 early promoter
sequence and is flanked with an XhoI site. The sequence of the 5' primer is:
5':GCGCCTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCG
AAATGATTTCCCCGAAATATCTGCCATCTCAATTAG:3' (SEQ ID NO:3)

The downstream primer is complementary to the SV40 promoter and is flanked
with a Hind III site: 5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3' (SEQ ID
NO:4).
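
The stated features of the two primers can be checked mechanically, as in the sketch below. The primer strings are transcribed as plain text from SEQ ID NO:3 and SEQ ID NO:4. The 11-nt repeat unit `TTTCCCCGAAA` is an assumption about where the boundaries of the GAS site fall, read off the primer itself rather than taken from the text.

```python
# Primer sequences transcribed from SEQ ID NO:3 and SEQ ID NO:4.
FORWARD = ("GCGCCTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCG"
           "AAATGATTTCCCCGAAATATCTGCCATCTCAATTAG")
REVERSE = "GCGGCAAGCTTTTTGCAAAGCCTAGGC"

XHOI = "CTCGAG"           # XhoI recognition site
HINDIII = "AAGCTT"        # HindIII recognition site
GAS_UNIT = "TTTCCCCGAAA"  # assumed repeat unit of the GAS binding site


def count_copies(seq, motif):
    """Number of non-overlapping occurrences of `motif` in `seq`."""
    return seq.count(motif)


def sv40_tail(seq, length=18):
    """The 3' stretch of the primer that anneals to the SV40 early promoter."""
    return seq[-length:]


print("GAS copies:", count_copies(FORWARD, GAS_UNIT))
print("XhoI site in 5' primer:", XHOI in FORWARD)
print("HindIII site in downstream primer:", HINDIII in REVERSE)
print("SV40-complementary tail:", sv40_tail(FORWARD))
```

A check of this kind only confirms that the written sequences carry the features the text describes; it says nothing about primer performance in the PCR itself.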

PCR amplification is performed using the SV40 promoter template present in
the B-gal:promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI/Hind III and subcloned into BLSK2- (Stratagene). Sequencing
with forward and reverse primers confirms that the insert contains the following
sequence:

5':CTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCGAAATG
ATTTCCCCGAAATATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCGCCC
CTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGC
CCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGC
CTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTT
TGCAAAAAGCTT:3' (SEQ ID NO:5)

With this GAS promoter element linked to the SV40 promoter, a
reporter construct is next engineered. Here, the reporter molecule is a secreted alkaline
phosphatase, or "SEAP." Clearly, however, any reporter molecule can be used instead of
SEAP, in this or in any of the other Examples. Well known reporter molecules that can
be used instead of SEAP include chloramphenicol acetyltransferase (CAT), luciferase,
alkaline phosphatase, B-galactosidase, green fluorescent protein (GFP), or any protein
detectable by an antibody.

The above sequence confirmed synthetic GAS-SV40 promoter element is
subcloned into the pSEAP-Promoter vector obtained from Clontech using HindIII and
XhoI, effectively replacing the SV40 promoter with the amplified GAS:SV40 promoter
element, to create the GAS-SEAP vector. However, this vector does not contain a
neomycin resistance gene, and therefore, is not preferred for mammalian expression
systems.

Thus, in order to generate mammalian stable cell lines expressing the GAS-SEAP
reporter, the GAS-SEAP cassette is removed from the GAS-SEAP vector using
SalI and NotI, and inserted into a backbone vector containing the neomycin resistance
gene, such as pGFP-1 (Clontech), using these restriction sites in the multiple cloning
site, to create the GAS-SEAP/Neo vector. Once this vector is transfected into
mammalian cells, this vector can then be used as a reporter molecule for GAS binding
as described in Examples 13-14.

Other constructs can be made using the above description and replacing GAS
with a different promoter sequence. For example, construction of reporter molecules
containing NF-kB and EGR promoter sequences are described in Examples 15 and 16.
However, many other promoters can be substituted using the protocols described in
these Examples. For instance, SRE, IL-2, NFAT, or Osteocalcin promoters can be
substituted, alone or in combination (e.g., GAS/NF-kB/EGR, GAS/NF-kB,
Il-2/NFAT, or NF-kB/GAS). Similarly, other cell lines can be used to test reporter
construct activity, such as HELA (epithelial), HUVEC (endothelial), Reh (B-cell),
Saos-2 (osteoblast), HUVAC (aortic), or Cardiomyocyte.

Example 13: High-Throughput Screening Assay for T-cell Activity

The following protocol is used to assess T-cell activity by identifying factors,
such as growth factors and cytokines, that may proliferate or differentiate T-cells.
T-cell activity is assessed using the GAS/SEAP/Neo construct produced in Example 12.
Thus, factors that increase SEAP activity indicate the ability to activate the Jaks-STATS
signal transduction pathway. The T-cells used in this assay are Jurkat T-cells (ATCC
Accession No. TIB-152), although Molt-3 cells (ATCC Accession No. CRL-1552) and
Molt-4 cells (ATCC Accession No. CRL-1582) can also be used.

Jurkat T-cells are lymphoblastic CD4+ Th1 helper cells. In order to generate
stable cell lines, approximately 2 million Jurkat cells are transfected with the
GAS-SEAP/neo vector using DMRIE-C (Life Technologies) (transfection procedure
described below). The transfected cells are seeded to a density of approximately
20,000 cells per well and transfectants resistant to 1 mg/ml genticin are selected. Resistant
colonies are expanded and then tested for their response to increasing concentrations of
interferon gamma. The dose response of a selected clone is demonstrated.

Specifically, the following protocol will yield sufficient cells for 75 wells
containing 200 ul of cells. Thus, it is either scaled up, or performed in multiple to
generate sufficient cells for multiple 96 well plates. Jurkat cells are maintained in RPMI
+ 10% serum with 1% Pen-Strep. Combine 2.5 ml of OPTI-MEM (Life Technologies)
with 10 ug of plasmid DNA in a T25 flask. Add 2.5 ml OPTI-MEM containing 50 ul
of DMRIE-C and incubate at room temperature for 15-45 mins.

During the incubation period, count cell concentration, spin down the required
number of cells (10^7 per transfection), and resuspend in OPTI-MEM to a final
concentration of 10^7 cells/ml. Then add 1 ml of 1 x 10^7 cells in OPTI-MEM to T25 flask
and incubate at 37°C for 6 hrs. After the incubation, add 10 ml of RPMI + 15% serum.

The Jurkat:GAS-SEAP stable reporter lines are maintained in RPMI + 10%
serum, 1 mg/ml Genticin, and 1% Pen-Strep. These cells are treated with supernatants
containing a polypeptide as produced by the protocol described in Example 11.

On the day of treatment with the supernatant, the cells should be washed and
resuspended in fresh RPMI + 10% serum to a density of 500,000 cells per ml. The
exact number of cells required will depend on the number of supernatants being
screened. For one 96 well plate, approximately 10 million cells (for 10 plates, 100
million cells) are required.

Transfer the cells to a triangular reservoir boat, in order to dispense the cells into
a 96 well dish, using a 12 channel pipette. Using a 12 channel pipette, transfer 200 ul
of cells into each well (therefore adding 100,000 cells per well).
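
The cell numbers quoted for seeding follow directly from the density and the dispensed volume. A minimal sketch of that arithmetic, with illustrative function names:

```python
def cells_per_well(density_per_ml, volume_ul):
    """Cells delivered to one well from a suspension at `density_per_ml`."""
    return density_per_ml * volume_ul / 1000


def cells_per_plate(density_per_ml, volume_ul, wells=96):
    """Cells needed to fill every well of one plate."""
    return cells_per_well(density_per_ml, volume_ul) * wells


per_well = cells_per_well(500_000, 200)    # 500,000 cells/ml, 200 ul per well
per_plate = cells_per_plate(500_000, 200)  # one full 96 well plate
print(f"{per_well:,.0f} cells per well")
print(f"{per_plate:,.0f} cells per plate")
```

The per-plate figure of 9.6 million is what the text rounds to approximately 10 million cells per 96 well plate.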

After all the plates have been seeded, 50 ul of the supernatants are transferred
directly from the 96 well plate containing the supernatants into each well using a 12
channel pipette. In addition, a dose of exogenous interferon gamma (0.1, 1.0, 10 ng)
is added to wells H9, H10, and H11 to serve as additional positive controls for the
assay.

The 96 well dishes containing Jurkat cells treated with supernatants are placed in
an incubator for 48 hrs (note: this time is variable between 48-72 hrs). 35 ul samples
from each well are then transferred to an opaque 96 well plate using a 12 channel
pipette. The opaque plates should be covered (using cellophane covers) and stored at
-20°C until SEAP assays are performed according to Example 17. The plates
containing the remaining treated cells are placed at 4°C and serve as a source of material
for repeating the assay on a specific well, if desired.

As a positive control, 100 Unit/ml interferon gamma, which is known to activate
Jurkat T cells, can be used. Over 30 fold induction is typically observed in the
positive control wells.

Example 14: High-Throughput Screening Assay Identifying Myeloid
Activity

The following protocol is used to assess myeloid activity by identifying factors,
such as growth factors and cytokines, that may proliferate or differentiate myeloid cells.
Myeloid cell activity is assessed using the GAS/SEAP/Neo construct produced in
Example 12. Thus, factors that increase SEAP activity indicate the ability to activate the
Jaks-STATS signal transduction pathway. The myeloid cell used in this assay is U937,
a pre-monocyte cell line, although TF-1, HL60, or KG1 can be used.

To transiently transfect U937 cells with the GAS/SEAP/Neo construct produced
in Example 12, a DEAE-Dextran method (Kharbanda et al., 1994, Cell Growth &
Differentiation, 5:259-265) is used. First, harvest 2x10e7 U937 cells and wash with
PBS. The U937 cells are usually grown in RPMI 1640 medium containing 10%
heat-inactivated fetal bovine serum (FBS) supplemented with 100 units/ml penicillin and 100
mg/ml streptomycin.

Next, suspend the cells in 1 ml of 20 mM Tris-HCl (pH 7.4) buffer containing
0.5 mg/ml DEAE-Dextran, 8 ug GAS-SEAP2 plasmid DNA, 140 mM NaCl, 5 mM
KCl, 375 uM Na2HPO4.7H2O, 1 mM MgCl2, and 675 uM CaCl2. Incubate at 37°C
for 45 min.

Wash the cells with RPMI 1640 medium containing 10% FBS and then
resuspend in 10 ml complete medium and incubate at 37°C for 36 hr.

The GAS-SEAP/U937 stable cells are obtained by growing the cells in 400
ug/ml G418. The G418-free medium is used for routine growth but every one to two
months, the cells should be re-grown in 400 ug/ml G418 for a couple of passages.

These cells are tested by harvesting 1x10^8 cells (this is enough for ten 96-well
plate assays) and washing with PBS. Suspend the cells in 200 ml of the above described
growth medium, with a final density of 5x10^5 cells/ml. Plate 200 ul cells per well in the
96-well plate (or 1x10^5 cells/well).
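
The figures in this plating step are mutually consistent, which the following sketch verifies. It assumes the harvest is 1x10^8 cells, the value implied by 200 ml at 5x10^5 cells/ml; the names are illustrative.

```python
HARVESTED_CELLS = 1e8    # assumed harvest, implied by 200 ml at 5x10^5 cells/ml
SUSPENSION_ML = 200
VOLUME_PER_WELL_UL = 200
WELLS_PER_PLATE = 96


def density_per_ml(cells, volume_ml):
    """Cell density after suspending `cells` in `volume_ml`."""
    return cells / volume_ml


def full_plates(volume_ml, ul_per_well, wells_per_plate):
    """Whole plates that can be filled from the suspension."""
    wells = int(volume_ml * 1000 // ul_per_well)
    return wells // wells_per_plate


density = density_per_ml(HARVESTED_CELLS, SUSPENSION_ML)
per_well = density * VOLUME_PER_WELL_UL / 1000
plates = full_plates(SUSPENSION_ML, VOLUME_PER_WELL_UL, WELLS_PER_PLATE)
print(f"density: {density:.0e} cells/ml")
print(f"cells per well: {per_well:.0e}")
print(f"full 96-well plates: {plates}")
```

The 200 ml suspension covers 1000 wells, slightly more than the 960 wells of ten plates, so the stated quantity leaves a small margin.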

Add 50 ul of the supernatant prepared by the protocol described in Example 11.
Incubate at 37°C for 48 to 72 hr. As a positive control, 100 Unit/ml interferon gamma,
which is known to activate U937 cells, can be used. Over 30 fold induction is typically
observed in the positive control wells. SEAP assay the supernatant according to the
protocol described in Example 17.

Example 15: High-Throughput Screening Assay Identifying Neuronal
Activity

When cells undergo differentiation and proliferation, a group of genes is
activated through many different signal transduction pathways. One of these genes,
EGR1 (early growth response gene 1), is induced in various tissues and cell types upon
activation. The promoter of EGR1 is responsible for such induction. Using the EGR1
promoter linked to reporter molecules, activation of cells can be assessed.

Particularly, the following protocol is used to assess neuronal activity in PC12
cell lines. PC12 cells (rat pheochromocytoma cells) are known to proliferate and/or
differentiate by activation with a number of mitogens, such as TPA (tetradecanoyl
phorbol acetate), NGF (nerve growth factor), and EGF (epidermal growth factor). The
EGR1 gene expression is activated during this treatment. Thus, by stably transfecting
PC12 cells with a construct containing an EGR promoter linked to SEAP reporter,
activation of PC12 cells can be assessed.

The EGR/SEAP reporter construct can be assembled by the following protocol.
The EGR-1 promoter sequence (-633 to +1) (Sakamoto K et al., Oncogene 6:867-871
(1991)) can be PCR amplified from human genomic DNA using the following primers:
5' GCGCTCGAGGGATGACAGCGATAGAACCCCGG-3' (SEQ ID NO:6)
5' GCGAAGCTTCGCGACTCCCCGGATCCGCCTC-3' (SEQ ID NO:7)

Using the GAS:SEAP/Neo vector produced in Example 12, EGR1 amplified
product can then be inserted into this vector. Linearize the GAS:SEAP/Neo vector
using restriction enzymes XhoI/HindIII, removing the GAS/SV40 stuffer. Restrict the
EGR1 amplified product with these same enzymes. Ligate the vector and the EGR1
promoter.

To prepare 96 well-plates for cell culture, two ml of a coating solution (1:30
dilution of collagen type I (Upstate Biotech Inc. Cat#08-115) in 30% ethanol (filter
sterilized)) is added per one 10 cm plate or 50 ul per well of the 96-well plate, and
allowed to air dry for 2 hr.

PC12 cells are routinely grown in RPMI-1640 medium (Bio Whittaker)
containing 10% horse serum (JRH BIOSCIENCES, Cat. # 12449-78P), 5%
heat-inactivated fetal bovine serum (FBS) supplemented with 100 units/ml penicillin and 100
ug/ml streptomycin on a precoated 10 cm tissue culture dish. One to four split is done
every three to four days. Cells are removed from the plates by scraping and
resuspended by pipetting up and down more than 15 times.

Transfect the EGR/SEAP/Neo construct into PC12 using the Lipofectamine
protocol described in Example 11. EGR-SEAP/PC12 stable cells are obtained by
growing the cells in 300 ug/ml G418. The G418-free medium is used for routine
growth but every one to two months, the cells should be re-grown in 300 ug/ml G418
for a couple of passages.

To assay for neuronal activity, a 10 cm plate with cells around 70 to 80%
confluent is screened by removing the old medium. Wash the cells once with PBS
(Phosphate buffered saline). Then starve the cells in low serum medium (RPMI-1640
containing 1% horse serum and 0.5% FBS with antibiotics) overnight.

The next morning, remove the medium and wash the cells with PBS. Scrape
off the cells from the plate, and suspend the cells well in 2 ml low serum medium. Count
the cell number and add more low serum medium to reach a final cell density of 5x10^5
cells/ml.
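
The amount of low serum medium to add after counting follows from conservation of cell number. The sketch below shows the calculation; the counted density of 2x10^6 cells/ml is a hypothetical value chosen for illustration, not a figure from the protocol.

```python
TARGET_DENSITY = 5e5  # cells/ml, the final density called for above


def medium_to_add_ml(counted_density, current_volume_ml, target=TARGET_DENSITY):
    """Volume of medium to add so the suspension reaches `target` cells/ml."""
    if counted_density < target:
        raise ValueError("suspension is already below the target density")
    total_cells = counted_density * current_volume_ml
    final_volume_ml = total_cells / target
    return final_volume_ml - current_volume_ml


# Hypothetical count: 2x10^6 cells/ml in the 2 ml resuspension.
print(medium_to_add_ml(2e6, 2.0))
```

If the count comes in below the target, the function raises an error rather than returning a negative volume, since dilution cannot raise the density.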

Add 200 ul of the cell suspension to each well of 96-well plate (equivalent to
1x10^5 cells/well). Add 50 ul supernatant produced by Example 11. Incubate at 37°C
for 48 to 72 hr. As a positive control, a growth factor known to activate PC12 cells
through EGR can be used, such as 50 ng/ul of Neuronal Growth Factor (NGF). Over
fifty-fold induction of SEAP is typically seen in the positive control wells. SEAP assay
the supernatant according to Example 17.

Example 16: High-Throughput Screening Assay for T-cell Activity

NF-kB (Nuclear Factor kB) is a transcription factor activated by a wide variety
of agents including the inflammatory cytokines IL-1 and TNF, CD30 and CD40,
lymphotoxin-alpha and lymphotoxin-beta, by exposure to LPS or thrombin, and by
expression of certain viral gene products. As a transcription factor, NF-kB regulates
the expression of genes involved in immune cell activation, control of apoptosis
(NF-kB appears to shield cells from apoptosis), B and T-cell development, anti-viral and
antimicrobial responses, and multiple stress responses.

In non-stimulated conditions, NF-kB is retained in the cytoplasm with I-kB
(Inhibitor kB). However, upon stimulation, I-kB is phosphorylated and degraded,
causing NF-kB to shuttle to the nucleus, thereby activating transcription of target
genes. Target genes activated by NF-kB include IL-2, IL-6, GM-CSF, ICAM-1 and
class 1 MHC.

Due to its central role and ability to respond to a range of stimuli, reporter
constructs utilizing the NF-kB promoter element are used to screen the supernatants
produced in Example 11. Activators or inhibitors of NF-kB would be useful in treating
diseases. For example, inhibitors of NF-kB could be used to treat those diseases
related to the acute or chronic activation of NF-kB, such as rheumatoid arthritis.

To construct a vector containing the NF-kB promoter element, a PCR based
strategy is employed. The upstream primer contains four tandem copies of the NF-kB
binding site (GGGGACTTTCCC) (SEQ ID NO:8), 18 bp of sequence complementary
to the 5' end of the SV40 early promoter sequence, and is flanked with an XhoI site:
5':GCGGCCTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGAC
TTTCCATCCTGCCATCTCAATTAG:3' (SEQ ID NO:9)

The downstream primer is complementary to the 3' end of the SV40 promoter
and is flanked with a Hind III site:

5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3' (SEQ ID NO:4)

PCR amplification is performed using the SV40 promoter template present in
the pB-gal:promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI and Hind III and subcloned into BLSK2- (Stratagene).
Sequencing with the T7 and T3 primers confirms the insert contains the following
sequence:

5':CTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGACTTTCC
ATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCA
TCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACT
AATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTC
CAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTT:
3' (SEQ ID NO:10)

Next, replace the SV40 minimal promoter element present in the pSEAP2-
promoter plasmid (Clontech) with this NF-kB/SV40 fragment using XhoI and HindIII.
However, this vector does not contain a neomycin resistance gene, and therefore, is not
preferred for mammalian expression systems.

In order to generate stable mammalian cell lines, the NF-kB/SV40/SEAP
cassette is removed from the above NF-kB/SEAP vector using restriction enzymes SalI
and NotI, and inserted into a vector containing neomycin resistance. Particularly, the
NF-kB/SV40/SEAP cassette was inserted into pGFP-1 (Clontech), replacing the GFP
gene, after restricting pGFP-1 with SalI and NotI.

Once NF-kB/SV40/SEAP/Neo vector is created, stable Jurkat T-cells are
created and maintained according to the protocol described in Example 13. Similarly,
the method for assaying supernatants with these stable Jurkat T-cells is also described
in Example 13. As a positive control, exogenous TNF alpha (0.1, 1, 10 ng) is added to
wells H9, H10, and H11, with a 5-10 fold activation typically observed.

Example 17: Assay for SEAP Activity 

As a reporter molecule for the assays described in Examples 13-16, SEAP
activity is assayed using the Tropix Phospho-light Kit (Cat. BP-400) according to the
following general procedure. The Tropix Phospho-light Kit supplies the Dilution,
Assay, and Reaction Buffers used below.

Prime a dispenser with the 2.5x Dilution Buffer and dispense 15 ul of 2.5x
dilution buffer into Optiplates containing 35 ul of a supernatant. Seal the plates with a
plastic sealer and incubate at 65°C for 30 min. Separate the Optiplates to avoid uneven
heating.

Cool the samples to room temperature for 15 minutes. Empty the dispenser and
prime with the Assay Buffer. Add 50 ul Assay Buffer and incubate at room
temperature 5 min. Empty the dispenser and prime with the Reaction Buffer (see the
table below). Add 50 ul Reaction Buffer and incubate at room temperature for 20
minutes. Since the intensity of the chemiluminescent signal is time dependent, and it
takes about 10 minutes to read 5 plates on luminometer, one should treat 5 plates at each
time and start the second set 10 minutes later.

Read the relative light unit in the luminometer. Set H12 as blank, and print the 
results. An increase in chemiluminescence indicates reporter activity. 
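
The screening examples report results as fold induction over control wells. The text does not give the formula, so the sketch below rests on an assumption: fold induction is the blank-subtracted signal of a sample well divided by the blank-subtracted signal of an untreated control well. The luminometer readings are hypothetical.

```python
def fold_induction(sample_rlu, control_rlu, blank_rlu):
    """Blank-subtracted sample signal relative to blank-subtracted control.

    Assumed formula; the source only states that H12 is set as blank.
    """
    baseline = control_rlu - blank_rlu
    if baseline <= 0:
        raise ValueError("control signal must exceed the blank")
    return (sample_rlu - blank_rlu) / baseline


# Hypothetical relative light unit readings.
blank = 100      # well H12
control = 1100   # untreated reporter cells
treated = 31100  # well treated with a positive control
print(fold_induction(treated, control, blank))
```

With these invented readings the treated well shows a 31-fold induction, on the order of the positive-control response described in Examples 13 and 14.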

Reaction Buffer Formulation:

# of plates    Rxn buffer diluent (ml)    CSPD (ml)

Example 18: High-Throughput Screening Assay Identifying Changes in
Small Molecule Concentration and Membrane Permeability

Binding of a ligand to a receptor is known to alter intracellular levels of small
molecules, such as calcium, potassium, sodium, and pH, as well as alter membrane
potential. These alterations can be measured in an assay to identify supernatants which
bind to receptors of a particular cell. Although the following protocol describes an
assay for calcium, this protocol can easily be modified to detect changes in potassium,
sodium, pH, membrane potential, or any other small molecule which is detectable by a
fluorescent probe.

The following assay uses Fluorometric Imaging Plate Reader ("FLIPR") to
measure changes in fluorescent molecules (Molecular Probes) that bind small
molecules. Clearly, any fluorescent molecule detecting a small molecule can be used
instead of the calcium fluorescent molecule, fluo-3, used here.

For adherent cells, seed the cells at 10,000-20,000 cells/well in a Co-star black
96-well plate with clear bottom. The plate is incubated in a CO2 incubator for 20 hours.
The adherent cells are washed two times in Biotek washer with 200 ul of HBSS
(Hank's Balanced Salt Solution) leaving 100 ul of buffer after the final wash.

A stock solution of 1 mg/ml fluo-3 is made in 10% pluronic acid DMSO. To
load the cells with fluo-3, 50 ul of 12 ug/ml fluo-3 is added to each well. The plate is
incubated at 37°C in a CO2 incubator for 60 min. The plate is washed four times in the
Biotek washer with HBSS leaving 100 ul of buffer.

For non-adherent cells, the cells are spun down from culture media. Cells are
re-suspended to 2-5x10^6 cells/ml with HBSS in a 50-ml conical tube. 4 ul of 1 mg/ml
fluo-3 solution in 10% pluronic acid DMSO is added to each ml of cell suspension.
The tube is then placed in a 37°C water bath for 30-60 min. The cells are washed twice
with HBSS, resuspended to 1x10^6 cells/ml, and dispensed into a microplate, 100
ul/well. The plate is centrifuged at 1000 rpm for 5 min. The plate is then washed once
in Denley CellWash with 200 ul, followed by an aspiration step to 100 ul final volume.
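
The two loading procedures arrive at nearly the same dye concentration, which a short mixing calculation makes visible. This is a sketch with illustrative names; it assumes simple volume additivity.

```python
def final_concentration(added_conc, added_vol, existing_vol):
    """Concentration after adding dye solution to dye-free buffer.

    Volumes share one unit; the result has the units of `added_conc`.
    """
    return added_conc * added_vol / (added_vol + existing_vol)


# Adherent cells: 50 ul of 12 ug/ml fluo-3 into the 100 ul left in the well.
adherent = final_concentration(12, 50, 100)

# Non-adherent cells: 4 ul of 1 mg/ml (1000 ug/ml) per 1 ml (1000 ul) of cells.
suspension = final_concentration(1000, 4, 1000)

print(f"adherent wells: {adherent:.2f} ug/ml")
print(f"cell suspension: {suspension:.2f} ug/ml")
```

Both routes load the cells at roughly 4 ug/ml fluo-3.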

For a non-cell based assay, each well contains a fluorescent molecule, such as
fluo-3. The supernatant is added to the well, and a change in fluorescence is detected.

To measure the fluorescence of intracellular calcium, the FLIPR is set for the
following parameters: (1) System gain is 300-800 mW; (2) Exposure time is 0.4
second; (3) Camera F/stop is F/2; (4) Excitation is 488 nm; (5) Emission is 530 nm; and
(6) Sample addition is 50 ul. Increased emission at 530 nm indicates an extracellular
signaling event which has resulted in an increase in the intracellular Ca++
concentration.
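
One way to express an increase in emission at 530 nm as a number is the peak fluorescence after sample addition divided by the mean pre-addition baseline. This is a hedged sketch only: the function name, the choice of a peak/baseline ratio, and the example readings are assumptions for illustration, and are neither FLIPR software output nor data from this assay.

```python
def peak_fold_change(readings, n_baseline):
    """Ratio of the peak post-addition reading to the mean pre-addition baseline.

    readings   -- fluorescence readings for one well, in time order
    n_baseline -- number of readings taken before the sample was added
    """
    if not 0 < n_baseline < len(readings):
        raise ValueError("need at least one baseline and one post-addition reading")
    baseline = sum(readings[:n_baseline]) / n_baseline
    if baseline <= 0:
        raise ValueError("baseline fluorescence must be positive")
    return max(readings[n_baseline:]) / baseline


# Made-up readings (arbitrary units): three before addition, four after.
example_well = [100.0, 102.0, 98.0, 150.0, 300.0, 240.0, 180.0]
print(peak_fold_change(example_well, n_baseline=3))
```

A ratio clearly above 1 would correspond to the increased emission described above; any threshold for calling a well positive would have to be set against control wells.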

Example 19: High-Throughput Screening Assay Identifying Tyrosine
Kinase Activity 

The Protein Tyrosine Kinases (PTK) represent a diverse group of 
transmembrane and cytoplasmic kinases. Within the Receptor Protein Tyrosine Kinase
(RPTK) group are receptors for a range of mitogenic and metabolic growth factors
including the PDGF, FGF, EGF, NGF, HGF and Insulin receptor subfamilies. In
addition there is a large family of RPTKs for which the corresponding ligand is
unknown. Ligands for RPTKs include mainly secreted small proteins, but also 
membrane-bound and extracellular matrix proteins. 

Activation of RPTK by ligands involves ligand-mediated receptor dimerization,
resulting in transphosphorylation of the receptor subunits and activation of the
cytoplasmic tyrosine kinases. The cytoplasmic tyrosine kinases include receptor
associated tyrosine kinases of the src-family (e.g., src, yes, lck, lyn, fyn) and
non-receptor linked and cytosolic protein tyrosine kinases, such as the Jak family, members
of which mediate signal transduction triggered by the cytokine superfamily of receptors
(e.g., the Interleukins, Interferons, GM-CSF, and Leptin).

Because of the wide range of known factors capable of stimulating tyrosine
kinase activity, the identification of novel human secreted proteins capable of activating
tyrosine kinase signal transduction pathways is of interest. Therefore, the following
protocol is designed to identify those novel human secreted proteins capable of
activating the tyrosine kinase signal transduction pathways.

Seed target cells (e.g., primary keratinocytes) at a density of approximately
25,000 cells per well in 96-well Loprodyne Silent Screen Plates purchased from
Nalge Nunc (Naperville, IL). The plates are sterilized with two 30 minute rinses with
100% ethanol, rinsed with water and dried overnight. Some plates are coated for 2 hr
with 100 ul of cell culture grade type I collagen (50 ug/ml), gelatin (2%) or polylysine
(50 ug/ml), all of which can be purchased from Sigma Chemicals (St. Louis, MO), or
10% Matrigel purchased from Becton Dickinson (Bedford, MA), or calf serum, rinsed
with PBS and stored at 4°C. Cell growth on these plates is assayed by seeding 5,000
cells/well in growth medium and indirect quantitation of cell number through use of
alamarBlue as described by the manufacturer Alamar Biosciences, Inc. (Sacramento,
CA) after 48 hr. Falcon plate covers #3071 from Becton Dickinson (Bedford, MA) are
used to cover the Loprodyne Silent Screen Plates. Falcon Microtest III cell culture
plates can also be used in some proliferation experiments.

To prepare extracts, A431 cells are seeded onto the nylon membranes of
Loprodyne plates (20,000/200 ul/well) and cultured overnight in complete medium.
Cells are quiesced by incubation in serum-free basal medium for 24 hr. After 5-20
minutes treatment with EGF (60 ng/ml) or 50 ul of the supernatant produced in Example
11, the medium is removed and 100 ul of extraction buffer (20 mM HEPES pH
7.5, 0.15 M NaCl, 1% Triton X-100, 0.1% SDS, 2 mM Na3VO4, 2 mM Na4P2O7
and a cocktail of protease inhibitors (# 1836170) obtained from Boehringer Mannheim
(Indianapolis, IN)) is added to each well and the plate is shaken on a rotating shaker for
5 minutes at 4°C. The plate is then placed in a vacuum transfer manifold and the extract
filtered through the 0.45 um membrane bottoms of each well using house vacuum.
Extracts are collected in a 96-well catch/assay plate in the bottom of the vacuum
manifold and immediately placed on ice. To obtain extracts clarified by centrifugation,
the content of each well, after detergent solubilization for 5 minutes, is removed and
centrifuged for 15 minutes at 4°C at 16,000 x g.

Test the filtered extracts for levels of tyrosine kinase activity. Although many 
methods of detecting tyrosine kinase activity are known, one method is described here. 

Generally, the tyrosine kinase activity of a supernatant is evaluated by 
determining its ability to phosphorylate a tyrosine residue on a specific substrate (a
biotinylated peptide). Biotinylated peptides that can be used for this purpose include
PSK1 (corresponding to amino acids 6-20 of the cell division kinase cdc2-p34) and
PSK2 (corresponding to amino acids 1-17 of gastrin). Both peptides are substrates for
a range of tyrosine kinases and are available from Boehringer Mannheim. 

The tyrosine kinase reaction is set up by adding the following components in
order. First, add 10 ul of 5 uM Biotinylated Peptide, then 10 ul ATP/Mg2+ (5 mM
ATP/50 mM MgCl2), then 10 ul of 5x Assay Buffer (40 mM imidazole hydrochloride,
pH 7.3, 40 mM beta-glycerophosphate, 1 mM EGTA, 100 mM MgCl2, 5 mM MnCl2,
0.5 mg/ml BSA), then 5 ul of Sodium Vanadate (1 mM), and then 5 ul of water. Mix the
components gently and preincubate the reaction mix at 30°C for 2 min. Initiate the
reaction by adding 10 ul of the control enzyme or the filtered supernatant.
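
As a cross-check on the recipe above, the component volumes can be summed and the final concentrations computed. This is a minimal sketch, assuming simple volumetric dilution and that the enzyme or supernatant contributes none of the listed reagents; the variable and function names are illustrative.

```python
# (component, volume in ul, stock concentration, unit), in order of addition.
COMPONENTS = [
    ("biotinylated peptide", 10.0, 5.0, "uM"),
    ("ATP", 10.0, 5.0, "mM"),
    ("assay buffer", 10.0, 5.0, "x"),
    ("sodium vanadate", 5.0, 1.0, "mM"),
    ("water", 5.0, None, None),
    ("control enzyme or filtered supernatant", 10.0, None, None),
]


def reaction_summary(components):
    """Return (total volume in ul, {component: (final concentration, unit)})."""
    total_ul = sum(volume for _, volume, _, _ in components)
    final = {
        name: (stock * volume / total_ul, unit)
        for name, volume, stock, unit in components
        if stock is not None  # water and sample have no stock concentration
    }
    return total_ul, final


total_ul, final = reaction_summary(COMPONENTS)
print(f"total reaction volume: {total_ul:g} ul")
for name, (conc, unit) in final.items():
    print(f"{name}: {conc:g} {unit}")
```

The volumes sum to 50 ul, so the 5x assay buffer ends up at 1x, which is consistent with its being supplied as a 5x stock.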

The tyrosine kinase assay reaction is then terminated by adding 10 ul of 120 mM
EDTA and placing the reactions on ice.

Tyrosine kinase activity is determined by transferring a 50 ul aliquot of the reaction
mixture to a microtiter plate (MTP) module and incubating at 37°C for 20 min. This
allows the streptavidin coated 96 well plate to associate with the biotinylated peptide.
Wash the MTP module with 300 ul/well of PBS four times. Next add 75 ul of
anti-phosphotyrosine antibody conjugated to horseradish peroxidase (anti-P-Tyr-POD
(0.5 u/ml)) to each well and incubate at 37°C for one hour. Wash the well as
above.

Next add 100 ul of peroxidase substrate solution (Boehringer Mannheim) and
incubate at room temperature for at least 5 min (up to 30 min). Measure the
absorbance of the sample at 405 nm by using an ELISA reader. The level of bound
peroxidase activity is quantitated using an ELISA reader and reflects the level of
tyrosine kinase activity.

Example 20: High-Throughput Screening Assay Identifying
Phosphorylation Activity 

As a potential alternative and/or complement to the assay of protein tyrosine
kinase activity described in Example 19, an assay which detects activation
(phosphorylation) of major intracellular signal transduction intermediates can also be
used. For example, as described below, one particular assay can detect tyrosine
phosphorylation of the Erk-1 and Erk-2 kinases. However, phosphorylation of other
molecules, such as Raf, JNK, p38 MAP, Map kinase kinase (MEK), MEK kinase,
Src, Muscle specific kinase (MuSK), IRAK, Tec, and Janus, as well as any other
phosphoserine, phosphotyrosine, or phosphothreonine molecule, can be detected by
substituting these molecules for Erk-1 or Erk-2 in the following assay.

Specifically, assay plates are made by coating the wells of a 96-well ELISA
plate with 0.1 ml of protein G (1 ug/ml) for 2 hr at room temp (RT). The plates are then
rinsed with PBS and blocked with 3% BSA/PBS for 1 hr at RT. The protein G plates
are then treated with 2 commercial monoclonal antibodies (100 ng/well) against Erk-1
and Erk-2 (1 hr at RT) (Santa Cruz Biotechnology). (To detect other molecules, this
step can easily be modified by substituting a monoclonal antibody detecting any of the
above described molecules.) After 3-5 rinses with PBS, the plates are stored at 4°C
until use.

A431 cells are seeded at 20,000/well in a 96-well Loprodyne filterplate and
cultured overnight in growth medium. The cells are then starved for 48 hr in basal
medium (DMEM) and then treated with EGF (6 ng/well) or 50 ul of the supernatants
obtained in Example 11 for 5-20 minutes. The cells are then solubilized and extracts
filtered directly into the assay plate.

After incubation with the extract for 1 hr at RT, the wells are again rinsed. As a
positive control, a commercial preparation of MAP kinase (10 ng/well) is used in place
of A431 extract. Plates are then treated with a commercial polyclonal (rabbit) antibody
(1 ug/ml) which specifically recognizes the phosphorylated epitope of the Erk-1 and
Erk-2 kinases (1 hr at RT). This antibody is biotinylated by standard procedures. The
bound polyclonal antibody is then quantitated by successive incubations with
Europium-streptavidin and Europium fluorescence enhancing reagent in the Wallac
DELFIA instrument (time-resolved fluorescence). An increased fluorescent signal over
background indicates phosphorylation.

Example 21: Method of Determining Alterations in a Gene 
Corresponding to a Polynucleotide 

RNA is isolated from entire families or individual patients presenting with a
phenotype of interest (such as a disease). cDNA is then generated from
these RNA samples using protocols known in the art. (See, Sambrook.) The cDNA is
then used as a template for PCR, employing primers surrounding regions of interest in
SEQ ID NO:X. Suggested PCR conditions consist of 35 cycles at 95°C for 30
seconds; 60-120 seconds at 52-58°C; and 60-120 seconds at 70°C, using buffer
solutions described in Sidransky, D., et al., Science 252:706 (1991).
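
The suggested cycling conditions can be written down as a small data structure, which also gives a rough estimate of the time spent in the cycling steps. This is a sketch under stated assumptions: the step labels (denaturation, annealing, extension) are the conventional reading of the three temperatures and are not named in the text, and ramp times and any initial or final holds are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str      # conventional name; an assumption, not from the text
    temp_c: tuple   # (min, max) temperature in degrees C
    seconds: tuple  # (min, max) hold time in seconds


CYCLES = 35
PROGRAM = [
    Step("denaturation", (95, 95), (30, 30)),
    Step("annealing", (52, 58), (60, 120)),
    Step("extension", (70, 70), (60, 120)),
]


def cycling_minutes(program, cycles):
    """Return (shortest, longest) total hold time in minutes, ignoring ramps."""
    shortest = cycles * sum(step.seconds[0] for step in program) / 60
    longest = cycles * sum(step.seconds[1] for step in program) / 60
    return shortest, longest


print(cycling_minutes(PROGRAM, CYCLES))
```

With these numbers the hold times alone come to between 87.5 and 157.5 minutes.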

PCR products are then sequenced using primers labeled at their 5' end with T4
polynucleotide kinase, employing SequiTherm Polymerase (Epicentre Technologies).
The intron-exon borders of selected exons are also determined and genomic PCR
products analyzed to confirm the results. PCR products harboring suspected mutations
are then cloned and sequenced to validate the results of the direct sequencing.

PCR products are cloned into T-tailed vectors as described in Holton, T.A. and
Graham, M.W., Nucleic Acids Research, 19:1156 (1991) and sequenced with T7
polymerase (United States Biochemical). Affected individuals are identified by
mutations not present in unaffected individuals.

Genomic rearrangements are also observed as a method of determining
alterations in a gene corresponding to a polynucleotide. Genomic clones isolated
according to Example 2 are nick-translated with digoxigenin-deoxy-uridine
5'-triphosphate (Boehringer Mannheim), and FISH performed as described in Johnson,
Cg. et al., Methods Cell Biol. 35:73-99 (1991). Hybridization with the labeled probe is
carried out using a vast excess of human cot-1 DNA for specific hybridization to the
corresponding genomic locus.

Chromosomes are counterstained with 4',6-diamidino-2-phenylindole and
propidium iodide, producing a combination of C- and R-bands. Aligned images for
precise mapping are obtained using a triple-band filter set (Chroma Technology,
Brattleboro, VT) in combination with a cooled charge-coupled device camera
(Photometrics, Tucson, AZ) and variable excitation wavelength filters. (Johnson, Cg.
et al., Genet. Anal. Tech. Appl., 8:75 (1991).) Image collection, analysis and
chromosomal fractional length measurements are performed using the ISee Graphical
Program System (Inovision Corporation, Durham, NC). Chromosome alterations of
the genomic region hybridized by the probe are identified as insertions, deletions, and
translocations. These alterations are used as a diagnostic marker for an associated
disease.

Example 22: Method of Detecting Abnormal Levels of a Polypeptide in a 
Biological Sample 

A polypeptide of the present invention can be detected in a biological sample, 
and if an increased or decreased level of the polypeptide is detected, this polypeptide is
a marker for a particular phenotype. Methods of detection are numerous, and thus, it is 
understood that one skilled in the art can modify the following assay to fit their 
particular needs. 

For example, antibody-sandwich ELISAs are used to detect polypeptides in a
sample, preferably a biological sample. Wells of a microtiter plate are coated with
specific antibodies, at a final concentration of 0.2 to 10 ug/ml. The antibodies are either
monoclonal or polyclonal and are produced by the method described in Example 10.
The wells are blocked so that non-specific binding of the polypeptide to the well is
reduced.

The coated wells are then incubated for > 2 hours at RT with a sample
containing the polypeptide. Preferably, serial dilutions of the sample should be used to
validate results. The plates are then washed three times with deionized or distilled water
to remove unbound polypeptide.

Next, 50 ul of specific antibody-alkaline phosphatase conjugate, at a
concentration of 25-400 ng, is added and incubated for 2 hours at room temperature.
The plates are again washed three times with deionized or distilled water to remove
unbound conjugate.

Add 75 ul of 4-methylumbelliferyl phosphate (MUP) or p-nitrophenyl
phosphate (NPP) substrate solution to each well and incubate 1 hour at room
temperature. Measure the reaction by a microtiter plate reader. Prepare a standard
curve, using serial dilutions of a control sample, and plot polypeptide concentration on
the X-axis (log scale) and fluorescence or absorbance on the Y-axis (linear scale).
Interpolate the concentration of the polypeptide in the sample using the standard curve.
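
The interpolation step can be sketched as follows. This is a minimal example, assuming piecewise-linear interpolation between adjacent standards with concentration on a log10 axis and signal on a linear axis; the function name and the standard values are hypothetical (not measured data), and a fitted model such as a four-parameter logistic curve could be used instead.

```python
import math


def interpolate_concentration(standards, signal):
    """Estimate a concentration from a signal using the standard curve.

    standards -- (concentration, signal) pairs from serial dilutions of a control
    signal    -- fluorescence or absorbance measured for the unknown sample
    """
    # Work on log10(concentration), matching the log-scale X-axis.
    points = sorted((math.log10(conc), sig) for conc, sig in standards)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 != y1 and min(y0, y1) <= signal <= max(y0, y1):
            x = x0 + (signal - y0) * (x1 - x0) / (y1 - y0)
            return 10 ** x
    raise ValueError("signal lies outside the range of the standard curve")


# Hypothetical standards: (concentration in ng/ml, absorbance).
standards = [(1, 0.10), (10, 0.50), (100, 0.90), (1000, 1.30)]
print(round(interpolate_concentration(standards, 0.70), 1))
```

Signals outside the range covered by the standards raise an error rather than being extrapolated, which is one reason to run serial dilutions of the sample.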

Example 23: Formulating a Polypeptide 

The secreted polypeptide composition will be formulated and dosed in a fashion
consistent with good medical practice, taking into account the clinical condition of the
individual patient (especially the side effects of treatment with the secreted polypeptide
alone), the site of delivery, the method of administration, the scheduling of
administration, and other factors known to practitioners. The "effective amount" for
purposes herein is thus determined by such considerations.

As a general proposition, the total pharmaceutically effective amount of secreted
polypeptide administered parenterally per dose will be in the range of about 1 ug/kg/day
to 10 mg/kg/day of patient body weight, although, as noted above, this will be subject
to therapeutic discretion. More preferably, this dose is at least 0.01 mg/kg/day, and
most preferably for humans between about 0.01 and 1 mg/kg/day for the hormone. If
given continuously, the secreted polypeptide is typically administered at a dose rate of
about 1 ug/kg/hour to about 50 ug/kg/hour, either by 1-4 injections per day or by
continuous subcutaneous infusions, for example, using a mini-pump. An intravenous
bag solution may also be employed. The length of treatment needed to observe changes
and the interval following treatment for responses to occur appear to vary depending
on the desired effect.
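
The per-kilogram figures above translate into absolute amounts by simple arithmetic. The sketch below is illustrative only and is not dosing guidance: the 70 kg body weight is an arbitrary example value, and the function names are hypothetical.

```python
def daily_dose_mg(dose_ug_per_kg_per_day, body_weight_kg):
    """Total amount per day in mg for a dose given in ug/kg/day."""
    return dose_ug_per_kg_per_day * body_weight_kg / 1000.0


def hourly_to_daily(rate_ug_per_kg_per_hour):
    """Convert a continuous rate in ug/kg/hour to ug/kg/day."""
    return rate_ug_per_kg_per_hour * 24.0


WEIGHT_KG = 70.0  # arbitrary example body weight

# General range: about 1 ug/kg/day to 10 mg/kg/day (10,000 ug/kg/day).
print(daily_dose_mg(1.0, WEIGHT_KG), daily_dose_mg(10_000.0, WEIGHT_KG))

# Continuous administration: about 1 to 50 ug/kg/hour.
for rate in (1.0, 50.0):
    per_day = hourly_to_daily(rate)
    print(rate, per_day, daily_dose_mg(per_day, WEIGHT_KG))
```

A continuous rate of 1 to 50 ug/kg/hour corresponds to 24 to 1200 ug/kg/day, which falls inside the general range of 1 ug/kg/day to 10 mg/kg/day.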

Pharmaceutical compositions containing the secreted protein of the invention are
administered orally, rectally, parenterally, intracisternally, intravaginally,
intraperitoneally, topically (as by powders, ointments, gels, drops or transdermal
patch), buccally, or as an oral or nasal spray. "Pharmaceutically acceptable carrier" refers
to a non-toxic solid, semisolid or liquid filler, diluent, encapsulating material or
formulation auxiliary of any type. The term "parenteral" as used herein refers to modes
of administration which include intravenous, intramuscular, intraperitoneal, intrasternal,
subcutaneous and intraarticular injection and infusion.

The secreted polypeptide is also suitably administered by sustained-release
systems. Suitable examples of sustained-release compositions include semi-permeable
polymer matrices in the form of shaped articles, e.g., films, or microcapsules.
Sustained-release matrices include polylactides (U.S. Pat. No. 3,773,919, EP 58,480),
copolymers of L-glutamic acid and gamma-ethyl-L-glutamate (Sidman, U. et al.,
Biopolymers 22:547-556 (1983)), poly (2-hydroxyethyl methacrylate) (R. Langer et
al., J. Biomed. Mater. Res. 15:167-277 (1981), and R. Langer, Chem. Tech. 12:98-105
(1982)), ethylene vinyl acetate (R. Langer et al.) or poly-D-(-)-3-hydroxybutyric
acid (EP 133,988). Sustained-release compositions also include liposomally entrapped
polypeptides. Liposomes containing the secreted polypeptide are prepared by methods
known per se: DE 3,218,121; Epstein et al., Proc. Natl. Acad. Sci. USA 82:3688-3692
(1983); Hwang et al., Proc. Natl. Acad. Sci. USA 77:4030-4034 (1980); EP 52,322;
EP 36,676; EP 88,046; EP 143,949; EP 142,641; Japanese Pat. Appl. 83-118008;
U.S. Pat. Nos. 4,485,045 and 4,544,545; and EP 102,324. Ordinarily, the liposomes
are of the small (about 200-800 Angstroms) unilamellar type in which the lipid content
is greater than about 30 mol. percent cholesterol, the selected proportion being adjusted
for the optimal secreted polypeptide therapy.

For parenteral administration, in one embodiment, the secreted polypeptide is
formulated generally by mixing it at the desired degree of purity, in a unit dosage
injectable form (solution, suspension, or emulsion), with a pharmaceutically acceptable 
carrier, i.e., one that is non-toxic to recipients at the dosages and concentrations 
employed and is compatible with other ingredients of the formulation. For example, the 
formulation preferably does not include oxidizing agents and other compounds that are 
known to be deleterious to polypeptides. 

Generally, the formulations are prepared by contacting the polypeptide
uniformly and intimately with liquid carriers or finely divided solid carriers or both.
Then, if necessary, the product is shaped into the desired formulation. Preferably the
carrier is a parenteral carrier, more preferably a solution that is isotonic with the blood 
of the recipient. Examples of such carrier vehicles include water, saline, Ringer's 
solution, and dextrose solution. Non-aqueous vehicles such as fixed oils and ethyl 
oleate are also useful herein, as well as liposomes.

The carrier suitably contains minor amounts of additives such as substances that
enhance isotonicity and chemical stability. Such materials are non-toxic to recipients at
the dosages and concentrations employed, and include buffers such as phosphate,
citrate, succinate, acetic acid, and other organic acids or their salts; antioxidants such as
ascorbic acid; low molecular weight (less than about ten residues) polypeptides, e.g.,
polyarginine or tripeptides; proteins, such as serum albumin, gelatin, or
immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids,
such as glycine, glutamic acid, aspartic acid, or arginine; monosaccharides,
disaccharides, and other carbohydrates including cellulose or its derivatives, glucose,
mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or
sorbitol; counterions such as sodium; and/or nonionic surfactants such as polysorbates,
poloxamers, or PEG.

The secreted polypeptide is typically formulated in such vehicles at a 
concentration of about 0.1 mg/ml to 100 mg/ml, preferably 1-10 mg/ml, at a pH of
about 3 to 8. It will be understood that the use of certain of the foregoing excipients, 
carriers, or stabilizers will result in the formation of polypeptide salts. 

Any polypeptide to be used for therapeutic administration can be sterile. 
Sterility is readily accomplished by filtration through sterile filtration membranes (e.g.,
0.2 micron membranes). Therapeutic polypeptide compositions generally are placed
into a container having a sterile access port, for example, an intravenous solution bag or 
vial having a stopper pierceable by a hypodermic injection needle. 

Polypeptides ordinarily will be stored in unit or multi-dose containers, for 
example, sealed ampoules or vials, as an aqueous solution or as a lyophilized 
formulation for reconstitution. As an example of a lyophilized formulation, 10-ml vials
are filled with 5 ml of sterile-filtered 1% (w/v) aqueous polypeptide solution, and the 
resulting mixture is lyophilized. The infusion solution is prepared by reconstituting the 
lyophilized polypeptide using bacteriostatic Water-for-Injection. 
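
As a worked example of the fill described above, the amount of polypeptide per vial follows directly from the definition of % (w/v). The reconstitution volume is not given in the text, so the 10 ml used below is purely a hypothetical value; the function names are likewise illustrative.

```python
def mass_mg(percent_w_v, volume_ml):
    """Mass in mg of solute in `volume_ml` of a `percent_w_v` % (w/v) solution.

    1% (w/v) means 1 g per 100 ml, i.e. 10 mg/ml.
    """
    return percent_w_v * 10.0 * volume_ml


def concentration_mg_per_ml(mass, volume_ml):
    """Concentration after dissolving `mass` mg in `volume_ml` ml."""
    return mass / volume_ml


per_vial_mg = mass_mg(1.0, 5.0)  # 5 ml of 1% (w/v) solution per vial
print(per_vial_mg)
print(concentration_mg_per_ml(per_vial_mg, 10.0))  # hypothetical 10 ml reconstitution
```

Each vial therefore holds 50 mg of polypeptide before lyophilization.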

The invention also provides a pharmaceutical pack or kit comprising one or 
more containers filled with one or more of the ingredients of the pharmaceutical
compositions of the invention. Associated with such container(s) can be a notice in the 
form prescribed by a governmental agency regulating the manufacture, use or sale of 
pharmaceuticals or biological products, which notice reflects approval by the agency of 
manufacture, use or sale for human administration. In addition, the polypeptides of the 
present invention may be employed in conjunction with other therapeutic compounds.

Example 24: Method of Treating Decreased Levels of the Polypeptide

It will be appreciated that conditions caused by a decrease in the standard or
normal expression level of a secreted protein in an individual can be treated by
administering the polypeptide of the present invention, preferably in the secreted form.
Thus, the invention also provides a method of treatment of an individual in need of an
increased level of the polypeptide comprising administering to such an individual a
pharmaceutical composition comprising an amount of the polypeptide to increase the
activity level of the polypeptide in such an individual.

For example, a patient with decreased levels of a polypeptide receives a daily
dose of 0.1-100 ug/kg of the polypeptide for six consecutive days. Preferably, the
polypeptide is in the secreted form. The exact details of the dosing scheme, based on
administration and formulation, are provided in Example 23.

Example 25: Method of Treating Increased Levels of the Polypeptide 

Antisense technology is used to inhibit production of a polypeptide of the
present invention. This technology is one example of a method of decreasing levels of
a polypeptide, preferably a secreted form, due to a variety of etiologies, such as cancer.

For example, a patient diagnosed with abnormally increased levels of a
polypeptide is administered intravenously antisense polynucleotides at 0.5, 1.0, 1.5,
2.0 and 3.0 mg/kg/day for 21 days. This treatment is repeated after a 7-day rest period
if the treatment was well tolerated. The formulation of the antisense polynucleotide is
provided in Example 23.

Example 26: Method of Treatment Using Gene Therapy 

One method of gene therapy transplants fibroblasts, which are capable of
expressing a polypeptide, onto a patient. Generally, fibroblasts are obtained from a
subject by skin biopsy. The resulting tissue is placed in tissue-culture medium and
separated into small pieces. Small chunks of the tissue are placed on a wet surface of a
tissue culture flask; approximately ten pieces are placed in each flask. The flask is
turned upside down, closed tight and left at room temperature overnight. After 24
hours at room temperature, the flask is inverted and the chunks of tissue remain fixed to
the bottom of the flask and fresh media (e.g., Ham's F12 media, with 10% FBS,
penicillin and streptomycin) is added. The flasks are then incubated at 37°C for
approximately one week.

At this time, fresh media is added and subsequently changed every several days.
After an additional two weeks in culture, a monolayer of fibroblasts emerges. The
monolayer is trypsinized and scaled into larger flasks.

pMV-7 (Kirschmeier, P.T. et al., DNA, 7:219-25 (1988)), flanked by the long
terminal repeats of the Moloney murine sarcoma virus, is digested with EcoRI and
HindIII and subsequently treated with calf intestinal phosphatase. The linear vector is
fractionated on agarose gel and purified, using glass beads.

The cDNA encoding a polypeptide of the present invention can be amplified
using PCR primers which correspond to the 5' and 3' end sequences respectively as set
forth in Example 1. Preferably, the 5' primer contains an EcoRI site and the 3' primer
includes a HindIII site. Equal quantities of the Moloney murine sarcoma virus linear
backbone and the amplified EcoRI and HindIII fragment are added together, in the
presence of T4 DNA ligase. The resulting mixture is maintained under conditions
appropriate for ligation of the two fragments. The ligation mixture is then used to
transform bacteria HB101, which are then plated onto agar containing kanamycin for
the purpose of confirming that the vector has the gene of interest properly inserted.
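
The primer design described above (an EcoRI site on the 5' primer and a HindIII site on the 3' primer) can be sketched in a few lines. This is an illustration under assumptions, not the primers of Example 1: the cDNA string is a made-up placeholder, the 18-base annealing length and the two extra leading bases are arbitrary choices, and the helper names are hypothetical. The recognition sequences GAATTC (EcoRI) and AAGCTT (HindIII) are the standard ones.

```python
ECORI = "GAATTC"    # EcoRI recognition sequence
HINDIII = "AAGCTT"  # HindIII recognition sequence
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(cdna, anneal_len=18, leader="GC"):
    """Return (5' primer, 3' primer), each written 5'->3'.

    leader -- extra bases placed before the restriction site (an assumption)
    """
    cdna = cdna.upper()
    forward = leader + ECORI + cdna[:anneal_len]
    reverse = leader + HINDIII + reverse_complement(cdna[-anneal_len:])
    return forward, reverse


def has_internal_site(cdna):
    """True if the insert itself would be cut by EcoRI or HindIII."""
    cdna = cdna.upper()
    return ECORI in cdna or HINDIII in cdna


placeholder_cdna = "ATGAAACTGCTGGTTCTGGCTCTGTGCCTGGCAGCCTAA"  # not a real sequence
assert not has_internal_site(placeholder_cdna)
forward_primer, reverse_primer = design_primers(placeholder_cdna)
print(forward_primer)
print(reverse_primer)
```

Checking the insert for internal EcoRI or HindIII sites before cloning matters because digestion would otherwise cut the amplified fragment itself.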

The amphotropic pA317 or GP+am12 packaging cells are grown in tissue
culture to confluent density in Dulbecco's Modified Eagles Medium (DMEM) with 10%
calf serum (CS), penicillin and streptomycin. The MSV vector containing the gene is
then added to the media and the packaging cells transduced with the vector. The
packaging cells now produce infectious viral particles containing the gene (the
packaging cells are now referred to as producer cells).

Fresh media is added to the transduced producer cells, and subsequently, the
media is harvested from a 10 cm plate of confluent producer cells. The spent media,
containing the infectious viral particles, is filtered through a millipore filter to remove
detached producer cells and this media is then used to infect fibroblast cells. Media is
removed from a sub-confluent plate of fibroblasts and quickly replaced with the media
from the producer cells. This media is removed and replaced with fresh media. If the
titer of virus is high, then virtually all fibroblasts will be infected and no selection is
required. If the titer is very low, then it is necessary to use a retroviral vector that has a
selectable marker, such as neo or his. Once the fibroblasts have been efficiently
infected, the fibroblasts are analyzed to determine whether protein is produced.

The engineered fibroblasts are then transplanted onto the host, either alone or 
after having been grown to confluence on cytodex 3 microcarrier beads. 

Example 27: Method of Treatment Using Gene Therapy - In Vivo

Another aspect of the present invention is using in vivo gene therapy
methods to treat disorders, diseases and conditions. The gene therapy method
relates to the introduction of naked nucleic acid (DNA, RNA, and antisense
DNA or RNA) sequences into an animal to increase or decrease the expression
of the polypeptide of the present invention. A polynucleotide of the present
invention may be operatively linked to a promoter or any other genetic elements
necessary for the expression of the encoded polypeptide by the target tissue.
Such gene therapy and delivery techniques and methods are known in the art,
see, for example, WO90/11092, WO98/11779; U.S. Patent Nos. 5693622,
5705151, 5530859; Tabata H. et al. (1997) Cardiovasc. Res. 35(3):470-479,
Chao J. et al. (1997) Pharmacol. Res. 35(6):517-522, Wolff J.A. (1997)
Neuromuscul. Disord. 7(5):314-318, Schwartz B. et al. (1996) Gene Ther.
3(5):405-411, Tsurumi Y. et al. (1996) Circulation 94(12):3281-3290
(incorporated herein by reference).

The polynucleotide constructs of the present invention may be delivered
by any method that delivers injectable materials to the cells of an animal, such
as injection into the interstitial space of tissues (heart, muscle, skin, lung, liver,
intestine and the like). These polynucleotide constructs can be delivered in a
pharmaceutically acceptable liquid or aqueous carrier.

The term "naked" polynucleotide, DNA or RNA, refers to sequences
that are free from any delivery vehicle that acts to assist, promote, or facilitate
entry into the cell, including viral sequences, viral particles, liposome
formulations, lipofectin or precipitating agents and the like. However, the
polynucleotides may also be delivered in liposome formulations (such as those
taught in Felgner P.L. et al. (1995) Ann. NY Acad. Sci. 772:126-139 and
Abdallah B. et al. (1995) Biol. Cell 85(1):1-7) which can be prepared by
methods well known to those skilled in the art.

The polynucleotide vector constructs of the present invention used in
the gene therapy method are preferably constructs that will not integrate into the
host genome nor will they contain sequences that allow for replication. Any
strong promoter known to those skilled in the art can be used for driving the
expression of DNA. Unlike other gene therapy techniques, one major
advantage of introducing naked nucleic acid sequences into target cells is the
transitory nature of the polynucleotide synthesis in the cells. Studies have
shown that non-replicating DNA sequences can be introduced into cells to
provide production of the desired polypeptide for periods of up to six months.

The polynucleotide construct of the present invention can be delivered to
the interstitial space of tissues within an animal, including those of muscle, skin,
brain, lung, liver, spleen, bone marrow, thymus, heart, lymph, blood, bone,
cartilage, pancreas, kidney, gall bladder, stomach, intestine, testis, ovary,
uterus, rectum, nervous system, eye, gland, and connective tissue. Interstitial
space of the tissues comprises the intercellular fluid, mucopolysaccharide matrix
among the reticular fibers of organ tissues, elastic fibers in the walls of vessels or
chambers, collagen fibers of fibrous tissues, or that same matrix within
connective tissue ensheathing muscle cells or in the lacunae of bone. It is
similarly the space occupied by the plasma of the circulation and the lymph fluid
of the lymphatic channels. Delivery to the interstitial space of muscle tissue is
preferred for the reasons discussed below. The constructs may be conveniently delivered
by injection into the tissues comprising these cells. They are preferably delivered
to and expressed in persistent, non-dividing cells which are differentiated,
although delivery and expression may be achieved in non-differentiated or less
completely differentiated cells, such as, for example, stem cells of blood or skin
fibroblasts. In vivo muscle cells are particularly competent in their ability to take
up and express polynucleotides.

For the naked polynucleotide injection, an effective dosage amount of
DNA or RNA will be in the range of from about 0.05 ug/kg body weight to about
50 mg/kg body weight. Preferably the dosage will be from about 0.005 mg/kg
to about 20 mg/kg and more preferably from about 0.05 mg/kg to about 5 mg/kg.
Of course, as the artisan of ordinary skill will appreciate, this dosage will vary
according to the tissue site of injection. The appropriate and effective dosage of
nucleic acid sequence can readily be determined by those of ordinary skill in the
art and may depend on the condition being treated and the route of
administration. The preferred route of administration is by the parenteral route of
injection into the interstitial space of tissues. However, other parenteral routes
may also be used, such as inhalation of an aerosol formulation, particularly for
delivery to lungs or bronchial tissues, throat or mucous membranes of the nose.
In addition, naked polynucleotide constructs can be delivered to arteries during
angioplasty by the catheter used in the procedure.
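
The nested preferred ranges can be expressed as a simple lookup. This is a minimal sketch, assuming doses are given in mg/kg; it encodes only the two preferred ranges stated in mg/kg above, the 0.02 kg body weight is an arbitrary example (roughly a laboratory mouse), and the function names are hypothetical.

```python
PREFERRED = (0.005, 20.0)     # mg/kg, "preferably"
MORE_PREFERRED = (0.05, 5.0)  # mg/kg, "more preferably"


def classify_dose(mg_per_kg):
    """Label a dose by the narrowest stated range that contains it."""
    if MORE_PREFERRED[0] <= mg_per_kg <= MORE_PREFERRED[1]:
        return "more preferred"
    if PREFERRED[0] <= mg_per_kg <= PREFERRED[1]:
        return "preferred"
    return "outside the preferred ranges"


def total_dose_mg(mg_per_kg, body_weight_kg):
    """Absolute amount of nucleic acid in mg for a given body weight."""
    return mg_per_kg * body_weight_kg


for dose in (0.001, 0.01, 1.0, 10.0):
    print(dose, classify_dose(dose), total_dose_mg(dose, 0.02))
```

Because the dosage is stated to vary with the tissue site of injection, such a lookup can only bracket a starting point, not choose a dose.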

The dose response effects of injected polynucleotide in muscle in vivo are
determined as follows. Suitable template DNA for production of mRNA coding
for the polypeptide of the present invention is prepared in accordance with a
standard recombinant DNA methodology. The template DNA, which may be
either circular or linear, is either used as naked DNA or complexed with
liposomes. The quadriceps muscles of mice are then injected with various
amounts of the template DNA.

Five to six week old female and male Balb/C mice are anesthetized by
intraperitoneal injection with 0.3 ml of 2.5% Avertin. A 1.5 cm incision is made
on the anterior thigh, and the quadriceps muscle is directly visualized. The
template DNA is injected in 0.1 ml of carrier in a 1 cc syringe through a 27 gauge
needle over one minute, approximately 0.5 cm from the distal insertion site of the
muscle into the knee and about 0.2 cm deep. A suture is placed over the
injection site for future localization, and the skin is closed with stainless steel
clips.

After an appropriate incubation time (e.g., 7 days) muscle extracts are prepared
by excising the entire quadriceps. Every fifth 15 µm cross-section of the individual
quadriceps muscles is histochemically stained for protein expression. A time course for
protein expression may be done in a similar fashion except that quadriceps from
different mice are harvested at different times. Persistence of DNA in muscle following
injection may be determined by Southern blot analysis after preparing total cellular DNA
and HIRT supernatants from injected and control mice. The results of the above
experimentation in mice can be used to extrapolate proper dosages and other treatment
parameters in humans and other animals using naked DNA of the present invention.
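
The sampling scheme described above (staining every fifth 15 µm cross-section) amounts to one stained section per 75 µm of tissue. The following minimal sketch shows that arithmetic. The function name and the convention that the fifth, tenth, fifteenth, ... sections are the ones stained are assumptions made for illustration and are not stated in the text.

```python
from typing import List


def stained_section_offsets_um(tissue_length_um: float,
                               thickness_um: float = 15.0,
                               every_nth: int = 5) -> List[float]:
    """Return the starting depth (in µm) of each section selected for staining.

    Sections are cut consecutively at `thickness_um`; every `every_nth`
    section (assumed here to be the 5th, 10th, ...) is selected.
    """
    if thickness_um <= 0 or every_nth < 1 or tissue_length_um < 0:
        raise ValueError("invalid sectioning parameters")
    n_sections = int(tissue_length_um // thickness_um)
    # Zero-based index every_nth - 1 is the first selected section.
    return [i * thickness_um for i in range(every_nth - 1, n_sections, every_nth)]


if __name__ == "__main__":
    # A hypothetical 300 µm stretch of muscle yields 20 sections, 4 of them stained.
    print(stained_section_offsets_um(300.0))
```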

It will be clear that the invention may be practiced otherwise than as particularly
described in the foregoing description and examples. Numerous modifications and
variations of the present invention are possible in light of the above teachings and,
therefore, are within the scope of the appended claims.

The entire disclosure of each document cited (including patents, patent
applications, journal articles, abstracts, laboratory manuals, books, or other
disclosures) in the Background of the Invention, Detailed Description, and Examples is
hereby incorporated herein by reference.

Sequence Listing



(1) GENERAL INFORMATION:

(i) APPLICANT: Human Genome Sciences,

(ii) TITLE OF INVENTION: 207 Human Secreted Proteins



(iii) NUMBER OF SEQUENCES: S00 



(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Human Genome Sciences, Inc.
(B) STREET: 9410 Key West Avenue
(C) CITY: Rockville
(D) STATE: Maryland
(E) COUNTRY: USA
(F) ZIP: 20850

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage
(B) COMPUTER: HP Vectra 486/33
(C) OPERATING SYSTEM: MSDOS version 6.2
(D) SOFTWARE: ASCII Text

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Kenley K. Hoover
(B) REGISTRATION NUMBER: 40,302
(C) REFERENCE/DOCKET NUMBER: PZ007PCT

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (301) 309-8504
(B) TELEFAX: (301) 309-3439

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 733 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GGGATCCGGA GCCCAAATCT TCTGACAAAA CTCACACATG CCCACCGTGC CCAGCACCTG
AATTCGAGGG TGCACCGTCA GTCTTCCTCT TCCCCCCAAA ACCCAAGGAC ACCCTCATGA 
TCTCCCGGAC TCCTGAGGTC ACATGCGTGG TGGTGGACGT AAGCCACGAA GACCCTGAGG 
TCAAGTTCAA CTGGTACGTG GACGGCGTGG AGGTGCATAA TGCCAAGACA AAGCCGCGGG 
AGGAGCAGTA CAACAGCACG TACCGTGTGG TCAGCGTCCT CACCGTCCTG CACCAGGACT 
GGCTGAATGG CAAGGAGTAC AAGTGCAAGG TCTCCAACAA AGCCCTCCCA ACCCCCATCG 
AGAAAACCAT CTCCAAAGCC AAAGGGCAGC CCCGAGAACC ACAGGTGTAC ACCCTGCCCC 
CATCCCGGGA TGAGCTGACC AAGAACCAGG TCAGCCTGAC CTGCCTGGTC AAAGGCTTCT 
ATCCAAGCGA CATCGCCGTG GAGTGGGAGA GCAATGGGCA GCCGGAGAAC AACTACAAGA 
CCACGCCTCC CGTGCTGGAC TCCGACGGCT CCTTCTTCCT CTACAGCAAG CTCACCGTGG 
ACAAGAGCAG GTGGCAGCAG GGGAACGTCT TCTCATGCTC CGTGATGCAT GAGGCTCTGC 
ACAACCACTA CACGCAGAAG AGCCTCTCCC TGTCTCCGGG TAAATGAGTG CGACGGCCGC
GACTCTAGAG GAT

(2) INFORMATION FOR SEQ ID NO: 2:



WO 98/54963 






(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Trp Ser Xaa Trp Ser 



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 86 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
GCGCCTCGAG ATTTCCCCGA AATCTAGATT TCCCCGAAAT GATTTCCCCG AAATGATTTC 
CCCGAAATAT CTGCCATCTC AATTAG 



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
GCGGCAAGCT TTTTGCAAAG CCTAGGC 



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 271 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
CTCGAGATTT CCCCGAAATC TAGATTTCCC CGAAATGATT TCCCCGAAAT GATTTCCCCG
AAATATCTGC CATCTCAATT AGTCAGCAAC CATAGTCCCG CCCCTAACTC CGCCCATCCC
GCCCCTAACT CCGCCCAGTT CCGCCCATTC TCCGCCCCAT GGCTGACTAA TTTTTTTTAT
TTATGCAGAG GCCGAGGCCG CCTCGGCCTC TGAGCTATTC CAGAAGTAGT GAGGAGGCTT
TTTTGGAGGC CTAGGCTTTT GCAAAAAGCT T



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GCGCTCGAGG GATGACAGCG ATAGAACCCC GG 



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GCGAAGCTTC GCGACTCCCC GGATCCGCCT C



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GGGGACTTTC CC 
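
Each entry in this listing declares a LENGTH that should equal the number of residues in its SEQUENCE DESCRIPTION. The sketch below shows one way to check that for the short nucleotide entries; it is offered only as an illustration, and the helper names residues and check_length are hypothetical. It counts letters only, so the grouping spaces and any position counters at the end of a line are ignored. It would not be suitable for amino acid entries written in three-letter code.

```python
import re
from typing import Dict, Tuple


def residues(sequence_block: str) -> str:
    """Return only the letters of a sequence block, upper-cased.

    Spaces, line breaks and numeric position counters are discarded.
    """
    return "".join(re.findall(r"[A-Za-z]", sequence_block)).upper()


def check_length(declared: int, sequence_block: str) -> bool:
    """True when the declared LENGTH matches the counted residues."""
    return len(residues(sequence_block)) == declared


# Declared lengths and sequences copied from SEQ ID NOs: 4, 6, 7 and 8.
ENTRIES: Dict[int, Tuple[int, str]] = {
    4: (27, "GCGGCAAGCT TTTTGCAAAG CCTAGGC"),
    6: (32, "GCGCTCGAGG GATGACAGCG ATAGAACCCC GG"),
    7: (31, "GCGAAGCTTC GCGACTCCCC GGATCCGCCT C"),
    8: (12, "GGGGACTTTC CC"),
}

if __name__ == "__main__":
    for seq_id, (declared, block) in ENTRIES.items():
        status = "ok" if check_length(declared, block) else "MISMATCH"
        print(f"SEQ ID NO: {seq_id}: declared {declared}, "
              f"counted {len(residues(block))} ({status})")
```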



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 73 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GCGGCCTCGA GGGGACTTTC CCGGGGACTT TCCGGGGACT TTCCGGGACT TTCCATCCTG
CCATCTCAAT TAG

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 256 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
CTCGAGGGGA CTTTCCCGGG GACTTTCCGG GGACTTTCCG GGACTTTCCA TCTGCCATCT
CAATTAGTCA GCAACCATAG TCCCGCCCCT AACTCCGCCC ATCCCGCCCC TAACTCCGCC
CAGTTCCGCC CATTCTCCGC CCCATGGCTG ACTAATTTTT TTTATTTATG CAGAGGCCGA
GGCCGCCTCG GCCTCTGAGC TATTCCAGAA GTAGTGAGGA GGCTTTTTTG GAGGCCTAGG
CTTTTGCAAA AAGCTT

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2526 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GACAGGCTAT CCGAGAATCT GAGAGCTGGG CCCGGCAATT CCTCCAGT fTA CCCTTGTGAC 

CTAAGTCCAG TCACACATTT CCCAAAGTTT CTCTTTGTCA TAACC CTGGT CTGGCTGGTT 

TTGRGGRCTT GAGAATGGGT CAGGGACTCC AGGCCAAGTC CAACAGAGAC CCCAAACCCA 

CCACACACCA GCAGCCACAA CCTCACCACC AACAAAGAGG ACTTTTGTGG GGCCACAAGT 

AAGAGGTCAT TTCTGGAATG GACTCAGACC TTTAAACAGG AGAGTTGAGC ACTTCCAGK3 

AGTTTTTAAG CAAGGCATGG GGAACAGGGA ATAGAACCTT TCAAAGAGGT TGCCCAGAGA 

AAAGCTGGGC CTCTTGCATT CGGCTTCCTT GGAGCAGCCT CTTCTGGCAG AAAGCCATCA 

GGTGCTCAAT CATCTTCTCC TGGCCAAGGC TCTGAOCATG CTTAGTACTG GAATAGAGGT 






PCT/US98/11422




GGCCAGGCCC CCAGCGACTC TTCTTGCCCT GATGTTTGTC CTCACAGGCA TGCCACGTGG 540 
CCTGAGA.TGA TTCAGAACAA ATCATGCTAA CTTTGAATCC ATCCAGCCAC TTGCAAATGA 600 


TAATCAGAAG TCAGCTTGTT CACIGTTAGA AAGAAACTAA CAAAAGAGAA CCCAGAGCAA. ooO 
TCTAGAATCT TTGAGTGCTT GGCTTTCCAA GGATACTGCG GAGACTCTGG CCAAGCTGAT 720 

10 GAMCTTCTGA ARTGTCACTG GCACCATATG CAAGAAGAAC CACCATTCAC TGAGTAGCTA 73G 
ATGGGTTTGG GGCCTGGGAC ATTCCATCTG AGGTCCTTCC TGAACATGTC ACTGCACftGC 340 
AGAGGACCGG TTGCAGCTTA CCCAGAACCA CTCCTCCAGG AGAGCTGGAT GTTTTGCGTG 900 
CAACACCTTG AGCACTGACT GCTATTGTTC AAAAAAAGCC TTTGCTGCAT TOGGAGGACT 960 

GCCCCGTGCC CTGAGGTGAC TTCCTAACTA TGTGGTTTCA TTAGCGAATT TATTTTTTGT L020- 

20 GCTGGGTGGA CATTTGTATT TTGTTAGGTT GCTGTTTAAG CTCAAGTTTG CTGTGCTCTC 1080 

TGCA.GCTACA AAACATCTTG GCATATTTAA GAXTGGCTTT TATAAATAGC TTTATTCTGA 1140 

TATTAATCA.G ATTCCCAACT TTACTGAGAA TTAAGGACTG GGGTACTTTA AAGAAATGCA 1200 


AATAGCAATT GAAGAACCAC TGCTGCAGGT -fc-GTAGCCCTG GCTAGACTGA ATTACACTAG 1260 

AAA.TCAGCCA GAAGGAAGCG TCCTTGGGAT CCCAGATCAC TCTTTTTTTT TTTTTTTTTA 132Q 

30 AAAGGGGCAG CCCCTTGATG GCTCATCTC? CTGAATAACA GTTACGTCTT CATATCGATA 1330 

• CCAGATGCCT TCTTCATCAT GCCACTGAAG CCACTCACCA CCTTCAAGAA CA.TGCCAACC 1440 

TCTGTCAGAT TCACTTACCC ACAAACAAGG AGGCACGTTT GGCACAAAGT GTTGTCCTCC 1500 


AGGTCCAAGT GGACTCTACA GAGTGCTTGA CCTCAACACA CTGGATTCCA GGTGGACTGG 15 60 

ACCAAGAGCA GGCAAAGACA CGGGAACTGA AAAACTCCAC AGGGTTTGGA GAATAGAAAT 1620 

40 GAAAAGCCAC GTCATATAAC TCAAGAATAA ATGGTGTTTT GGAAATTTTA AAATTATCAT 1630 

CGAAGGTGGT GAAACTATTT CAGGCCCAAA TGAAAGGAAA TCGCCAGTTG GGGA.TGAAAT 1740 
CACAGAGCCT GTGTTTTATG ATATGGTTGG ATGTCCACTG ATGAAATTTT AAAGGAGTTT ' 1300 

CATTTTTAAA AGTGCGCATG ATTCTACATA TGAGAATTCT TTAGGCCAAG AAACTGTCCT 1360 

TGGCTCAGAG GTGTTGGGAA TTAAAGCAGA GAGAAGCCAT TCGTGATGCT TAGAACCAAG 1920 

50 GATGGTCATG TACACAAAGA CCATCGAGAC GGCCATTCTT GTTTACAAAA CACTTACCAA 1980 

GAAAGCACTT TGTAGGGGAA CTTTAGTAAG TTCTTCTCAT TTCATTATGT TTCTTCCAAG 2040 

GAAACAC-GAG AGAGTGAATT AATAATTCTC TCTTTCCTCT TAAGCACTTT TAAAATAATA 2100 


AAGTACATCT TGAAATTTGG GGGGGCATCT CTGATTTAAA AAAAGAAAAA. GGCTGCTTGA 2150 

TGTATGTTAT GCAGAGACAC TCTGCCTCTG GTGGCTGCAG AQGAATACCC AAGCCTCATT 2220 

TGGAAGGCTC AACATTTGGA ATTGCACTTT AATTGATTAA TCCTCAATTC ATGTGGCCTT 2280
ACGGGATGGT GGGTCTGGGA CCCCAATTCA TTCTTATCTG CCAAAGAATT ATCTAGAAGC 2340

ACATC-AA.TA. CCAGCACCCC ACCTGCACAA TGGGGGTGGA AAACTTTTGT ATC CCT AAGC 2400 


ATATTATTTT ATAGTGTCTG CCATGCCATG TGGAAATACT TTATTTTTAA CCTCAGGATT 2460 

TAAATAAAGT AAACACTATG ACATTTAAAA AAAAAAAAAA AAAACTCGAG GGGGGCCCGG 2520 

TACCCA 2526



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1131 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

25' CACTGCACCA GCTTTGTTAT CTGTAAAATG ATGATAATAC CAACACCTTC TTCTTGGGGT 50 

ACTGAAGATG AGAGAACATG ATA.TGTGTAA 'AGTGCCTTCC AOiA.TA.CCC\ GAACATAGCA 120 

AACATGTAAT GAATGTAGTA ATAGTAATTA TTTTATTTTC TTTTGATTCA GTTGGGACTA 130 


TGTTCAGCTG TAACAGAA.TA CCCAAAATAA CTGTTTTAAA CAAATTAAAG TTTWGTTGTG 240 

AAGTTTTGTT ACGAATTCAG ACAATCCAGG GCTTTTATAG ATGCACCAGG ATCAGCAGGT 300 

35 ACAAAGGCAT CTTTCCTGAT TTCTGCCAGT CTCAATGCAT GGGTTGCAAT CCAGARTCCA 3 60 

RGATGGCAGT TCCAGCCCTG GTTACGCC C A TATTAGCACA CAGAAAGAAA GAGAAAGGGA 420 

• TGTGCCTCTT CACTTTAATC ATAGCTCCCA CTAGATGCAC CCACTACTTC TGCTGATACT 430 


•CCATTAGCTA ATGCTTGCTT ACATGGTCAC ACTTAGTTTC CAGAGAGACA TGTCTGGACA 540 

GTCATGTGCT CAATTAA.TAT CCAAGTGTCC AATTACTGAG AAAAAAAGAA ACTAGCACCT 500 

45 TTGCTTGGTT GCATTCCTCT TAGCATAAGC CA.CATTCTTT TTATGAAGTT GTCCTCAGTT 560 

ACTTGGATGC CTCAGTTGTC CTTTCAWTTA GAAAWGCjTCC TKGGACAYCC TGAAJWCTGAC 720 
TTCTTTTGTC ATCAGCACCA TCACTACCAC TGCCYTCTTC AAAGCCACCA CGTTCTGTCC • 730- 


C'CAGGATGGT TGCAACAACC ACCATAGGGA CTTTTTGCCT . TCTACTTCCA CAGAA.TAGWC 340 

CAGAGTAAGC TTTTGAAAAT GTAGGTCAGA TCATGTCTCT CTCTTCCTCT TCAAAACCCT 500 

55 CCCGATGGCT TTTCATATTA CTCAAAAGAA AACCTAAAAC TTTGCTGTGA GATCTATGTG 960 

ACCCGGCTTA TTCTTCCTCT TACTTTATCT CTGTATTGCT CTTCCTCACT CTACTCCAGC 1020 

CATCCCACCT CCTTCCTGCT TGTCCTATAC TCCTAAAAGA AGTTCAGTCT TCCCTTATGA 1080 



TATTTGCACT TAAAATAGAA AA?AAAAAAA AAAAAAAACT CGMGGGGGC C

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 941 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GGCACGAGTA GCATTTCATT TAATCTGCAG GTATATTCTC CCAACAGTTT ATTGTCATGT 
GATGTCCTCA GCCAAGATTG TRAGGCAGAG AGGAGCTGTC CCAACCTACT ATACCACCGA 
GGCTGGAGAG ATCATATTTT TGGTATTAAA GTGGAGTCTC TCCATCCTTC ACATTGTTGA 
TGTCCTCTGT AGCAAACCGG AAAAGTCAGT GACAGAAGAT GCCGCTAGCG GTTTGAGCCA 
GAGAATGACA GCTCTGGTTT GGAGAAAAGG GCCGGATGGT GGCTCTAGAA AGCCCATCCT 
TCTGCTCTTC TTTTTTCTCC CCCTTATATT GTGCTTTCAT TCATTCATTC ATTCATCAAA 
CATTTGTTGA GCACCTATTA TGTGTCAAGC TCTGTGCTAG CCTCTGGAAA ACCTGCCCTC 
ATGTAGCTCA CTGTGGAGTA GGAGAAACAA TG ACT ACACT ATGATAAGCA CGGGTTGTCA 
GGGTCTCACA GAGCAGTGGC CCCTCATCCA GACCGATGAG GTCAAAGAAG GCATCCAGGC 
GAGGATGGTG TCAGAGCTAA CTGAAGAATG AGAGGGAGCT GCACCASCAG GGGTTGGAAC 
TGAAGGTGGC AGTGCCTGGA GTCTTGATTC CAGCAGAGGG ■ AGAGC AGT CT GTGAAAAGGC 
ACCAAGGGTG GGAGAGGGCA GAGCACATGG AGGAACTTCA GGTAGTTCTG GATGGC SCTG 
GGGCAAAGCT AGAGAGGTAA GAAGAATCTA CAAATGTTCC TCGAGTTACA TGAACTTCCA 
TCCCAATAAA CCCATTGGAA -ACGAAAAATT TAAGTCAGAA. GTGCATTTAA GGCTGGTCCG 
AGTAGAATGA TTTTTACAAC GAATTGATCA CAACCAGTTA CAGATGTCTT TGTTCCTTCT 
CCACTCCCAC TGCTTCACCT GACTAGCCTT TAAAAAAAAA A 

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 843 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CCCTCAATTA AAGOGO^-.C CAA-AAGCTG by

CTAGGAACTA GTGGAATCCC CCGGGGCTGC 120 

TTGTATGATA CTAJTTTCCAC AAWATGCVTT 130 

AATTcrrrrr aaatattccg tgaatttctt i40 

GACAGAGTTC TAAATATGCG CATTAGATTA 3Q0 

ATATCCTTAT TAATCTGTCA ATCTGTTCAA 3 50 

GTGTGACTTT GCCCTTAAAA. TTCTGATGAT 420 

AGGAAGAACA TTAAAGAAGT TTCCATTGGC 430 

TTTTTATTTT GGCTIMCTAAG C-JGCTATGAA 540 

cATTTcrrrc c^gattacct tgttagcatc ■ soo 

AAAGTA.TCAA AATCATATAG GT ATGATTTT SZQ 

TAGTCCAGCG AGTATTTGTT GATGGAAATT 720 

GTTTACGATC TTTGGGACAT TCTCATTATT 730 

TAKACAACCC ATCCGTTCTT TAAAAAAAAA. 340 

343 

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1018 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CTGTAATTTT TAATTTTCAT ATACCGTGCT TTGATTCTAA TTTTATTTTT TGAGTTCTCT SO 


GAAGGTTACA TATACAGAGT , GCTTCAGGAA TGATO.TTTT GTTATTATTC ATGCTTCTTA 120 

ACAATGTTGT TTTA3TCCAA GAAGATAATT GCCAGAGAAA GAAT ACAGTG CAGGAAAGAA 130 

50 GARGCTGGAG CCAGTGGTGA AGARGGATTG AGAFGACAGA CATTGTGGGA ATGAAA.TCAT 240 

GAATAATCGT GTTTTTGAAT TGTCCAAAAA CTTCTACAAA CCATGAAATG TTGGAGTTTA 300 

AATCTAATTG TTGAAAAATT CCCCACATTC CTTGTATCCC TTAGGTTGAG C\TAATTCCA 360 


CATCCGTGGA CTGATGOGT TCC CAAGAGG GGGCCTCATT AACTCTTCCG AGGCAGCAGC 420 

AGCAAGGGCA CCCCCTCCTT TCCCCCCACA CCCCAYTTCT CATGGCTCTT CTTTCTCTCA 430 

60 TCTCATGCTT AGGTTAGPAA AGGGCACAAG GTAAGGAAGC CCTTGGGAAT AGGCTGAATC 540 



OiAGGGATAA CCCC-AAGeiT CGGAAATAAA 
GGAA.GTTCCC CGCOGCGGTG G03GC«CNGMT 
5 AGGGAATTCG GCACGGAGTG GGAATC-TTGT 
GAGACTTGGT KTGTGGCGTA GGAGATGGTC 
TAGTGCATAT TCTGCGATGG GGGCTGTGGG 

10 

AATCTCTTCA TTCTGTTGCT C\CATCTTCT 
GAGAGGTGTT ATTAAAA.TCT CTCACTGTAT 
15 TTGCTTTATA AATGGTTATA ACCATTTTCC 
ATTATCCAGT TTCCCTCAAA ATACTGGTTT 
TCGAGTTTCT CAGAAGCCCT TGTCTCAAGG 

20 

CACACTATGG GGTATTTTAG AAAAACAAAA 
CCTGTGCTTG AAGGAGGCTT AAAGCTCATC 
25 CTGCCAAGAA ATCTCTATTG TChAGATATT 
AGAAACAAAT CCTAAGAAGA AATTC TGCC A 
AAA. 

TGGCTATCTA ATTTGGTGCC AAATACTT.-A TGTGCTTGAA TTTAAAAACA GC-AACATGT
AGAAAGGTAA TTATAATTAT GAGGC CAGTT CTTTAAGCTA GCTTTTTTTC CCCTCTCAAA 
CAGCATATTG GCTTGGATGT CAGCAGGAGA AAGTGTTTTT TGCAATACAC ATAATGCATA 
TATGGTCCTG TTAGCAATCT ATAGAAAATA GATATTGCTC ATTAAGGTAA ATA T T TTT GT 
TGATGAATGA TCTGGAATGG TCTGGACTTG TTGTGTGAAC AGGAAATTGC TCTGTAGGCT 
TTGACTTGTG AGGTAAAGAG TGAGGCTGGT AAGATTAATT AAAGTAAATA CTGTGACAAT 
AGGATGTCAA AAC CAAAAAC GTGTTTGTGA AACTCAAGGA ATTAATGACA CATAGGGAAG 
TTTTTGCCAT ATTAAGCATA GAGTAGGAGA GGCAAGTCAA GAATAAAAAA AAAAAAAA 

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 661 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
TTTAAGAAAT TAGTGAATCC CCGGNTGCAG GGAATTCGGC ACGAGGAGGA GGCCGTCAGC 
TGGCAGGAGC GCAGGATGGC AGCTGYTCCC CCGGGTTGCA CCCCCCCAGY TCTGCTGGAC 
ATAAGVTGGT TAACAGAGAG CCTGGGAGCT GGGCAGCCTG TACCTGTGGA GTGCCGGCAC 
CGCCTGGAGG TGGCTGGGCC AAGQAAGGGG CCTCTGAGCC CAGCATGGAT GCCTGCCTAT 
GCCTGG CAGC GCCCTACGCC CCTCACACAC CACAACACTG GCCTMTCCGA GCTGCTGGAG 
CATGGAGTGT GTGAGGAGGT GGAGAGAGTT CGGCGCTCAG AGAGGTACCA GACCATGAAG 
GTGCGCAGGG CAGGGCTCGG ACCTACCCCA GGAATGTCCT GCCCTGGGA* TGACAACACA 
GTCCACACCA TGCACGGGGA GGCAAACASG GGCAGCTGAC CCAGCCCAGG GGTCAGANGA 
GGTCTTGCCG AGGAAGTGGC AGCTAAGCTG ATACCTGATA TCC^CWAGXC AGCCAP.GYGG 
AGACAGGCAA GGAAGAAGCT TGTTTTGAGG ACAGAATTTT CTAGATCACT CAGCACCATC 
TGGCTTTTGG GGCTTTTTGT TTTATTTTGT TTTTGAGACG GGGTCTCGCT CTGTCGCCCA 
N 

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 553 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
GGGACAGGGC TATTTGCCCC TCTCTCCACA TGACAGAA.CT GCTCTAAGTT TCTTTGCTGC 
TCTTCTCAGC TGTCAGA.CGG CTTGCTGCTT GTTTTCCACA CCA.CCATGTC TATTCTTTGC 
TGTCCT' ITWAC TCTGCCTGTT TTTTTCCTTT TGTATTTCTT CTGGCTCTTG TCCCTTTTCC 
CACGTGTCWC AGCTTT CCTT TATTGCCACT TTCAGTCAGA GCAGTCCTGT GCTTCTGGTG 
CCGGCATACA ATACTTACTT GAGTTTCTTG GCTTTTCTTG ACTGTGCATC TCTTACTTCA 
ACATAGGAAT AGCCTGTCAT AGAATTTCTC CAGTTCCAGG GCTCAAGAGG GA.GAGTGCCA 
GAAAATTGAG ACTGTTTTCG CTGTCTTGGA TTGAATTCAT AAAGCAAAAC CAGTGTTTGT 
GTGAGGGTTT GCTGTGTCAT GCCTATAGGT TGTTTGGGTG CAAACCTATA GAATCCAGCC 
TGCGAAAAGA AAGEAACCAG AGAATANCAG CATCAGAACA ATGCTTGACA TCATTTCTCA 
ATCAAGCAGT CCA 


(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 869 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GGCACGAGCT GCCAACACTG AGGTCTTCGT GGCTTCTCAC ATCTAGATGT ATCCCTCTCA 
AATCTATCCT CTATCCAGGC ACCAGATTGA GGTATCTAAA ATGTCAACTT TC CAGTTACT 
CCTTCTTATA CTAGCCCAAT CAA.CTTACAA GATAAAGTCC- AAGCCCCTTC ATATGA.CAAA 
CCACACCCTG CTTAACTCTC CAGGTTTGAA TCCTTCATCT CCTACTTTAA ACTTTAAAAC 
CCAGCAGCAC GAAAGTGTCT CCTATGCATG TTGCCATATG CGTTCTCTCC ATCATGCATT 
TGCCTGAGCA AGATGTCTTG AGTTAACATC TTATTCTTTA AGA.CTCATTG TGGTGGTAGA 
CAGCCTTTAA TAACGGATCC TTGGCCAGGC AC ACT G ACT C ACAC CTGTAA TCCCAGAACT 
TTGAAAGGCC AAAGAAGGAA GAAAGCTTGA GGCCAGTAGT TTGAGACCAG CCTGGGAAAC 
AGAGAGATAT CCCATCTGT A CCAAAAATTT AAAAAAATAT TAGCAGGGAG TAGTGGCATG 
C-.CAAGTGGT CCCAGCTCCA TGGGAGASTG AGGTAGGAAC ATCACTTGAG CCCAGGAAGT
CAAGGCTGCA GTGAACCATG ATCAGAACAT TGCANTCCAG CTTGGGTAAC AGAGTGA3AC
CTTAGGTCAG AAAAA.TGAAT AnATAAGCAT AAAATTTTAA. AAACTTAGC C AOGCATGGTG 
GCACACATCT GTGGTCCCTG CTACTTAGGA GGCTGAGGTG agaggatcct TGAGCCCAGG 
AGGTCAACAC TACAGTGAGG TATGATTGTG CCACTAAACT CCAACCTGGG TGAAAAAGCA 
AAAGGGTGGC AAAAAAAAAA AAAAAAACT 

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 959 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
GGCGAGCCGA GATCGTGCCA TTGCACTCCA GCCTGGGCAA CAAGAGTGAA ACTCTGTCTC 
AAAAAAAAAA AATTATAATA CTATATGCCA TAAAATGACA TTTCATATTT AAAGAGTTTT 
TTAAAACTCT TGTATTCACA TGCCATAATT TGAAACCCTA TTTCACTGAA TGAGAATGGT 
ATCTGTTGTC CTCATTTTTT CATTTTTA.TC CTTAACAATT TCCACCACAG CCAGTGCATA. 
TAA.TGGCAAT GACACCCAGG GATGGAATGA TAAGTTCCAT CRCIIGCTCAG TCAAGACGCA 
GACTTGATGT GGCCCCAACA ACAGTCAATA ATGGAGTCTC CLA.AAATAAAG CTCTATAGGA 
AAGGTAAATA CCCGCTGCAC AAGAAACCAC AGCATCTAGG TTCTAACCCC ATCTCTATGA 
AGAGCTTGCT GGGAGAGTTT TGACATTWAA CAATCTGTCT GATKGCCAAT TTTYTTCTTC 
TATAAAA.TGA TAATGTTKGA YTCAAAGATC CAAAGTC-A.T TCATGGTCTA AAACTTAATG 
ATTTTTTTAG GTTTTGKGAC ATTTCACTGT ACACTGTAGT AATTTATATC TTATTTTCCC 
ACTAATTTAG AAAAATATYT AAATGATCCT TAATTGGCAA. TGGGTCCTAA GAATTTTGTT 
TTAAATCCCT GTTAGCCAAA AGAGCCCTTT TTTGTATCTC GCAGTAGTTA CAAGGATCTT 
TCTAAATCTT AAAAAAAAAA AAAAAAGAAA GAAAGAAAAG AAAAGAAAAA. AAGTCAGCCG 
GGCGTGGTGG CTCATGCCTG TAATCCCAGC ACTTTGGGAC CAAGGTGGAC AGATCACGAG 
GTCAGGAGAT GGAGACCATC CCGGCCAACA TGGAGAAACC CTGTCTCTAC TAAAPAAAAA 
AArVAACTCGA GGGGGGGCCG GTACCCAATM CGGCGGCTAG TGGTCGTAAA ACAATCAAA 



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1446 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
CGGGGCAGGG CTGTGTGGGA CCGCCAGGGA GCGGGCCCAC CTGAGTCACT TTATTGGGTT • 60 

CAGTCAACAC TTTCTTGCTC CCTGTTTTCT CTTCTGTGGG ATGATCTCAG ATGCAGGGGC 120 

TGGTTTTGGG GTTTTCCTGC TTGTGCCAAG GC-CTGGACAC TGCTGGGGGG CTC-GAAA.GCC 130 
15 CCTCCCTTCC TGTCCTTCTG TGGCCTCCAT CCCCTCATGG GTGCTGCCAT - CCTTCCTGGA ' 240 

GA.GAGGGAGG TGAAAGCTGG TGTGAGCCCA GTGGGTTCCC GCCCACTCAC CCAGGAGCTG 300 

GCTGGGCCAG GACOGGGAGA GGG^J3CACTG CTGCCCTCCT GGCCCTGCTC CTTCCGCAGT 360 


TAGGGGTGGA CCGAGCCTCG CTTTCCCCAC TGTTCTGGAG GGAAGGGGAA GGAGCCQGTC 420 

TTCAGGCTGG AGCCAGGCTG GGGGTGCTGG GTGGA-GAGAT GAGATTTAGG GGGTGCCTCA 430 

25 TGGGGTGGC-C AGGCCTGGGG TGAAATHA.GA AAGGCCCAGA ACGTGCAGGT CTGCGGAGGG 540 

GAAGTGTCCT GAGTGAAGGA GGGGACCCCC ATCCTGGQGG ATGCTGGGAG TGAGTGAGTG 600 

' AGATGGCTGA GTGAGGGTTA TGGGGAGCCT GAGGTTTTAT GGGCCTGTGT ATCCCCTTCT '660 

CCCGGCCCCA GCCTGCCTCC CTCCTGCCCG CCTGGCCCAC AGGTCTCCCT CTGGTCC CTG 720 






TCCCTCTGGT GGTTGGGGAT GGAGCGGCAG CAAGGGGTGT AATGGGGCTG GGTTCTGTCT 730 

35 TCTACAGGCC ACCCCGAGGT CCTCAGTGGT TGCCTGGGGA GCCGGACGGG GCTCCTGAGG 340 

GGTACAGGTT GCGTTGGGCCC TCCCTGACGG TCTGGGGTCA GGCTTTGGCT CTGCTGCCTC 900 ' 

TCAGTCACCA AGTCACCTCC CTCTGAAAAT CCAGTCCCTT CTTTGGATGT CCTTGTGAGT 960 


CACTCTGGGC CTGGCTGTCG TCCCTCCTCA GCTTCTTGTT CCTGGGACAA GGGTCAAGCC 1020 

AGGATGGGCC CAGGCCTGGG ATCCCCCACC CCAGGACCCC CAGGCCCCC? CCCCTGCTGC 1080 

45 ' TTTGCGGGGG GCAGGGCAGA AATGGACTCC TTTTGGGTCC CCGAGGTGGG GTCCCCTCCC 1140 

AGCCCTGCAT CCTCCGTGCC STAGACCTGC TCCCCAGAGG AGGGGCCTTG ACCCACAGGA 1200 
CGTGTGGTGG CGCCTGGCAC TCAGGGACCC CCAGCTGCCC CAGCCCTGGT CTCTGGCGCA : 1260 


TCTCTTCGCT CTTGTCCGGA AGATCTGCGC CTCTAGTGCC TTTTGAGGGG TTCCCATCAT 1320 

CCCTCCCTGA TATTGTATTG AAAATA.TTAT GCACACTGTT CATGCTTCTA CT^ATCAATA 1330 

-55 'AACGCTTTAT TTAAAGCCAA AAAAAAAAAA AAAAAACTCG ?^GOGOGCC CGTACCCAAT 1440' 

TCGCCA 1446

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1471 base pairs
(B) TYPE: nucleic acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

CAAAAAATAA TAATGATAAT TTAAAJVTAAA TAAGTAACTA ATAAAAAGAT 'TTTATATCCC 

AGTCTTATGA TGTTGGTTGG CAAGGCTAGA TAAAAAGATG TTAGAATGAA AGAACATATT • 

TTTAGTGATA TGTAAATGAA GGATTCTACA ATAGTCATAT ATTTTTATAT GAA.TGAATGT 

TGGGTTGGGC TGGAGAGGTA TGTGTGTGTA AATATAAAGG TCTCACATTC AGAGTATAGC 

TCTGAAATAA TGGAACTCAT GTCTACAATT CAACATGCAT CTGTATAGTT ACATCTCATG 

TAAATATACA CAGACATATT TTGCAGCCAG TAATTGACAG TTAATGTCCA AAACAGGTGA 

TTGATAGGTA ACAGAAATTA GATAACCACC AATTTTGCCC AAGAGAAAGA CTAGAAGGAC 

TAAAAGCAGT TGAATGTATG GTACTGAGAT TGTCATAAGC AGTCTGATAA G CAGTTT ATT 

. GAAACGTGTG CATTAACAGA GAATTTAATT TTAAACCCAT AATTTCTGCT ATCCATTAAA 

ATATTATAAT . TGTTAGTAGT ATGAAA.CCAA CAGGAAATGT TTTTTAATCA TTTAGTGAGG 

TGATTCATTT GTTTCATGGG CAAACACTAT CCAGGAAAAG CCTTGCTTGC CTGTTTCCCA 

AAGAGCTCTA AGAAATAGAA TCAAGTGTAA AATGGTTCAG ACCATTCAGG ATTTCTTGTC 

ACTCTTCTCA AGCCCGATCT TCCTGTTATT ACTGATGTTT GAAACCCTGT -CATTAGCCCC 

GGCCTGGTTA AAGCCCCTCA GAGTCACCTC TCATTCATAG CAATAGAATT CAACCCCAAG 

TGGTTGATGG TGTCCGCAGG ACAGC CGAG A GACCTGATCT CTGGA'TTCAG TGCTTTTAGC 

TCTTCGAGTT TACCCTAAGA TACCTT CGGG CAATATTTTT AACGAACCCA AAAGCTCTTC 

AGGTCATTTC. TGAAGAGGAC AAGGTGAAT C TTGGCTTGGA ACACCATTTT TGGGCTCTTG 

CTACTGAATG AATCAGAAAG GAATTTTTTC TGAAGAGGAT TAGAAAGTAA AGGAGATGTT 

AAAA.TAAGTT CTTGAAGTAT GTTTTATATT TATCTAAAAC ACTGATTTTA AAA.GTTTACA 

TTCAAATGTG TATTCAAAAG AAGTACTGAT TTGTAATTAT TATAGTTTGT GTGTATCATC 

CCCTTTTAAC CGTGCCTAAC AACTGTACTT AAATTTTGTT TTCCTAGTGT AA.CAAATGTT 

TCCCATAAGA TTTTCTAGAG CGAAATAA.TG GGAGTGAAAA ATTCCTTAAG TGTTATATAA 

GAAAATATAT TAGPAAATCA GCTTTGGATT ATACGATTTG TAAAATATAC TAATACAGAA 

TCCTCAGTAA TATGTTTTGA ATTGGATTTT TTCTCAGAAC TGTTACATAA TAAATAATAC 

ATGAACCAGA AAAAAAAAAA AAAAAAATTM C

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1402 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

15 AGGGACGTCT TGCCTGAGGA GATGCCCATT TCTGTCCTGG RTTACCCTCA CTGCGTGGTG 50 - 

CATGAGCTGC CAGAGCTGAC GCCGGAGAGT TTGGAAGCAG GTGACAGTAA CCAATTTTGC 120 

TGGAGGAACC TCTTTTCTTG TATCAATCTG CTTCGGATCT TGAACAAGCT GACAAAGTGG 130 


AAGCATTCAA GGACAATGAT GCTGGTGGTG TTCAACTCAG CCCCCATCTT GAAGCGGGCC 24Q 

CTAAAGGTGA A^CAAGCCAT GATGCAGCTC TATGTGCTGA AGCTGCTCAA GGTACAGACC 300 

25 AAATACTTGG GGCGGCAGTG GCGAAAGAGC AACATGAAGA CCATGTCTGC CATCTACCAG 3 SO 

AAGGTGCGGC ATGGGCTGAA CGACGACTGG GCATACGGCA ATGATCTTGA TGCCCGGCCT 420 

TGGGACTTCC AGGCAGAGGA GTGTGCCCTT' CGTGCCAACA TTGAACGCTT CAACGCCCGG 430 


CGCTATGACC GGGCCCACAG CAACCCTGAC TTCCTGCCAG TGGACAACTG CCTGCAGAGT 540 

GTCCTGGGCC AACGGGTGGA CCTCCCTGAG GACTTTCAGA TGAA2TATGA CCTCTGGTTA 600 

35 GAAAGGGAGG TCTTCTCCAA GCCCATTTCC TGGGAAGAGC TGCTGCAGTG AGGCTGTTGG . 6o0 

TTAGGGGACT GAAATGGAGA GAAAAGATGA TCTGAAGGTA CCTGTGGGAC TGTCCTAGTT 720 

CATTGCTGCA GTGCTCCCAT CCCCCACCAG GTGGCAGCAC AGCCCCACTG TGTCTTCCGC 780 


AGTCTGTCCT GGGCTTGGGT GAGCCCAGCT TGACCTCCCC TTGGTTCCCA GGGTCCTGCT 340 

CCGAAGCAGT CATCTCTGCC TGAGATCCAT TCTTCCTTTA MTTCCCCCAM CCTCCTCTCT 900 

45 TGGATATGGT TGGTTTTGGC TCATTTCACA ATCAGCCCAA GGOTGGGAAA GCTGG AATGG 960 

GATGGGAACC CCTCCGCCGT GCATCTRAAT TTCAGGGGTC ATGCTGATGC CTCTCGAGAC 1020 

ATACAAATCC TTGCCTTTGT CAGCTTGCAA AGGAGGAGAG TTTAGGATTA GGGCCZ.GGGC 10S0 

CAGAAACTCG GTATCTTGGT TGTGCTCTGG GGTGGGGGTG GGGTGTTTCT GATGTTATTC 1140 

CAGCCTCCTG CTACATTATA TCCAGAAGTA ATTGCGGAGG CTCCTTCAGC TGC CTCAGCA 1200 

55 CTTTGATTTT GGACAGGGAC AAGGTAGGAA GAGAAGCTTC CCTTAACCAG AGGGGCCATT 1250 . 

TTTCCTTTTG GCTTTCGAGG GCCTGTAAAT ATCTATATAT AATTCTGTGT GTATTCTGTG 1320 

TCATGTTGGG GTTTTTAATG TGATTGTGTA TTCTGTTTAC ATTAAAAAGA" .-JGCAAAAATA 138-0 

ATAAAAAAAA AAAAAAAAAA CT 1402



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1047 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GGCA.CAGGGG ACTAGAGGCA CCCACGACCA TACCCAGCTA ATTTTTGTAT TTTT T TGTAG SO 

AGATGGGGTT TCACGATGTC GCCCAGGCTG GTCTTGAACT CCTGGGCTTG AGCGATCTTC 120 

CeATCTTTCC ATCTTGGCCT CCTAAAGTGC TGGGACTGCA GGCATGAGCC ACCATGCCCA 130 

GCCAAGATTC TTATTGATTA CCATGTTGCT TCAAGAAGCC AAGCCAGTTT CCAATATTCC 240 

GAGTCTTGGT ACTTTGGGT A *" GAAGCAACTG GTAAATTGTT AATTGGAACA." 300 

GGTG TAGATAACCA CGTATGGCCA AACCTAGAGC ATCTAGGCTC ACAATTACTA 350 

TCCTGACTTG ATAACAAGTG TTCTGATATT AACCTGAAAA TGGGAATAA.T GCCAAATCTG 420 

TGTAACTTAA CATCTA.TATA CACAGTGGGG AGAACTGAAG TTATTAAACC TGGAATCTCT 430 

GTGATCAAGG CTAACAGTAG TTATCTAAGA AGCAAAGGAC CTACAATTCT TAGACTTGGA 540 

GTCATATTCT TTAAGGACGT GTTCTGAAAC TATATCAAGC ATCTGGTTTC CACGTATTTC 500 

TCCGTCAGAA A.TTATGAAGT ACAAGTAAAA ATGAAGGTAC AGGGTAAGAC ACATGCTGCT - GS0 

TTCTTCCTCT TGAGTGGAGA CAGTTTTCCA GCCATCTTAA CCCCTTWACA CAAAACAATT 720 

TGTGTTTTAT AGCAAATAAG TGACTCAACA TAATTTCAA.T ATGATGTTTA TGCACCAGTA. 730 

CTTTCCTTTC AGCTTCTAGT CCCATAARTG - G TTTGTGAAG TCATCGGTTA CATTAGCCAA 34 Q 

GATAGGCCTA GACTTGAAGT CTAGAATGTT TTTCCCACTA TA.TGCCAAAG TAGAATGTGG 900 

GTATCTCAGG GTCATTTTTG TTGTTCAATT TCCCACCTGT ACAGTTGTTA TGATTCACTT 960 

TCCTTATGTG TGTAATAAAT CTTGTTCCAT GAAATGATCA AAAAAAAAAA AAAAAAAACT 1020 

CGAGGGGGGG CCCGGTACCC AAATCGC 1047 






(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 990 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

TTGGAAAGGG TCTAGCTCTT TCTCA.TTCAC OACTATATT AGAAGCACTT GAGGGAAATT 60 

TACCACTCCA AATCCAAAGC AATGAACAGT CTTTTCTGGA TGATTTTA.TT GCCTGTGTCC 120 

CAGGATCAAG TC-GTGGAAGG CTTGCAAGGT GGCTTCAGGC AGATTCATAT GCGGATCCTC 130 

AGAAAACATC TTTGATCCTG GAATAAGGAT GATATTCGTT GTGGTTGGCC TACCACCATA 240 

ACTGTTCAAA CAAAAGACCA GTAXGGGGAT GTGGTAGATG TTCCCAATA.T GAAGGTAATT 300 

15 ATAACTGGAT TAAATTAGCA GACATCTATA TACTGGCTGC AATGACTGAT AAAATTTTAG 3 SO 

AAATGCCAAG TGCTGAGRGT CCATTTGTTC TACCCTCTTT ATATAAAGGG TGATGCTGAA 420 

AGTTTGTTTA AATGACTTGT TTATATTAAT TAGTCCCCAA GTGTCCAAGT TACACCTGTT 430 


' TTTTTTGTGA GTTTGTTCTT TACATTTTGG TAGCTGTTA.C GGGGACTCAA AGGAGGGATA 340 

AGAAAGTATG CATCTAAAGA GTGGTAGACA CATACAGTGA AGGCCCTGAA • TATGTATTGA 500 

25 TTGAATAAAT GCATGAAAGA~ ATACA.TTTTT AAATTTTGTG TATAGTTTTG AAAGACTGAA 560 

GTACGTTCTG TGTTTGGTAT TACTGAAAGG AGATTTTAAA AATAACACTC ATTAAGTTAG 720 

AAATATATGA GTTTAGATTG TAAAAGAATG AGGAATTGAA. ATAGTTGTAT ACCATATTGA 780 


• TGAATATAGA GTTTTTAGGA TACCTCTTAC GTGAAATATT AA.TAATAATG TTTNCAGAGC 340 

ATATTATACA TAATTATTTG TGATTTAATC TGTTAATATG AAT ATCTC AT TTAAAACTTT 900 

35 TATTTCTGAA AAAATTATAT TGAATAAAAT TTTATATAGG CAGTCCCCAG GGCTTTCCTC 960 

CTTCAAAGTT GTCTTATAGA GTGATTGGTT • 990 



(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1208 base pairs
(B) TYPE: nucleic acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TAATCGCTAC TATAGGGAAA GCTGGTCGCT GCAGGTACCG GTCCGGAATT CCGGGTCGAC 60 

CCACGCGTCC GAGCGAAATG GCGCCTCCGG CCCCCGGCCC GGCCTCCGGG GGCTCCGGGG 120 


AGGTAGA.CGA GCTGTTCGAC GTAAAGAACG CCTTCTACAT CC-GC=-.GCTAC CAGCAGTGCA 130 
TAAACGAGGC GCASGGGTGA AGCTRTCAAG CCCAGAGAjGA GACGTGGAGA GGGACGTCTT 240 
CGTGTATAGA GCGTACCTGG CGCAGAGCAA GTTCGGTGTG GTCCTGGATG AGATCAAGGC 300
CTCCTCGGCC CCTGAGCTCC AGGCCGTGCG CATGTTTGCT GACTACCTCG CCCACGAGAG 360

TCGGAGGGAC AGCATCGTGG CCGAGCTGGA CCGAGAGATG AGCAGGnGCK TGGACGTGAC 420 


CAACACCACC TTCCTGCTCA TGGCCGCCTC CATCTATCTC CACGACCACA ACGOGGATGC 430 

CGCCCTGCGT GCGC?C<L\CC AGGGGGACAG CGTGGAGTGG ACAGCGATGA CMTGCAGAT 540 

10 CGTGCTGAAG CTGGAGCGCG TGGACCTCGC CGGGAAGGAG CTGAAGAGAA TGCAGGACGT 500 

GGACGA3GAT GCGACGCTCA CCCAGCTCGC CACTGCCTGG GTCAGGGTGG CCACGGGTGG 660 

TGAGAAGGTG CAGGATGGGT ACTACATCTT CGAGGAGATG GCTGACAAGT GCTCGCCCAC 72 Q 


CCTGCTGCTG CTCAATGGGC AGGCGGCCTG CCACATGCCC CAGGGCCGCT GGGAGGCCGC 730 

TGAGGGCCTG CTGCAGGAGG CGCTAGACAA GGATAGTGGG TACCCP.GAGA GGCTGGTCAA 340 

20 CCTCATCGTC CTGTCCCAGC ACGTKGGCAA GCCCCCTGAG GTGACAAAGG GATACCTOTC 900 

CCAGCTGAAG GATGCCCACA GGTCCCATCC CTTCATCAAG GAGTACOGG CCAAGGAGAA 960 
CGACTTTGAC AGGCTGGTGC TACA3TACGC TCCCA.GGGCT GAGGGTGGGG GAGAGGTGTG ■ 1020 


AGGACGATGA AGCCAGGACA GAGGGCAGGA GCCAGCCCTG CAGCCCTCCC CACCCGGCAT 1080 

CCAC'GTGCAT CCCTCTGGGG CAGGAGCCCA CGCGGAGGAC CCCCATCTGT TAMAAATAT 1140 

30 CTCAACTCCA RGGTGTTCCA G CTGAAAAAA AAAAAAPAAA AAAAMAAAA AAAAAAAAAA 1200 

AAAAAAAA 1203 



(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1922 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GTGCTGCGCT AGTGAGCAGC GCCATGGAGG ACTCTGAAGC ACTGGGCTTC GAACACATGG 60

GCCTCGATCC CCGGCTCCTT CAGGCTGTGA CCGATCTGGG CTGGTCGCGA CCTACGCTGA 120 

50 

TCCAGGAGAA GGCCATCCCA CTGGCCCTAG AAGGGAAGGA CCTCCTGGCT CGGGCCCGCA 130 

CGGGCTCCGG G AAGACGGC C GCTTATGCTA TTCCGATGCT GCAGCTGTTG CTCCATAGGA 240 

55 AGGCGACAGG TCCGGTGGTA GAAC-.GGCAG TGAGAGGCCT TGTTCTTGTT CCT AC CAAGG 300 

AGGTGGCACG GCAAGCACAG TCCATGATTC AGCAGCTGGG TACCTACTGT GCTCGGGATG 360 

TCCGAGTGGC CAATGTCTCA GCTGCTGAA3 AGTCAGTCTC TCAGAGAGCT GTGCTGATGG 420 

AGAAGGCAGA TGTGGTAGTA GSGACCCCAT CTCCCATATT AAGCCACTTG CAC^AGACA 480



GCC TGAAACT TCGTGACTCC CTGGAGCTTT TCGTGGTGGA CGAAGCTGAC C TTCT TT TT T 540 



5 CCTTTGC-CTT TGAAGAAGAG CTCAAGAGTC TCCTCTGTCA CTTGCCCCGG ATTTACCAGG 500 



CTTTTCTCAT GTCAGCTACT TTTAACGAGG ACGTACAAGC ACTCAAGGAG CTGATATTAC 660 



ATAACCCGGT TACCCTTAAG TTACAGGAGT CCCAGCTGCC TGCCOCAGAC CAGTTACAGC . 7?Q 

10 

AGTTTCAGGT GGTCTGTGAG ACTGAGGAAG ACAAATTCCT CCTGCTGTAT GCCCTGCTCA 730 



AGCTGTCATT GATTCGGGGC ' AAGTCTCTGC TCTTTGTCAA CACTCTAGAA CGGAGTTACC 340 



15 GGCTACGCCT GTTCTTGGAA CAGTTCAGCA TCCCCACCTG TGTGCTCAA.T GGAGAGCTTC 900 
CACTGCGCTC CAGGTGCCAC ATCATCTCAC AGTTCAACCA AGGCTTCTAC GACTGTGTCA 9 60 



TAGCAACTGA TGCTGAAGTC CTGGGGGCCC CAGTCAAGGG CAAGCGTCGG GGGCGAGGGC 1020 

20 

CMAAAGGGGA CAA.GGCCTCT GATCCGGAAG CAGGTGTGGC CCGGGGCATA. GACTTCCACC 1080 



ATGTGTCTGC TGTGCTCAAC TTTGATCTTC OCCCAACCCC TGAGGCCTA.C ATCCATCGAG 1140 



25 CTGGC^GGAC AGCACGCGCT .AACAACCCAG GGATAGTCTT AACCTTTGTG CTTGCCCCGG 1200 
AGCAGTTCCA CTTAGGCAAG ATTGAGGAGC TTCTCAGTGG AGAGAACAGG GGCCCCA.TTC 1260 



TGCTCCCCTA CCAGTTCCGG ATGGAGGAGA TCGAGGGCTT CCGCTATCGC TGCAGGGATG - 1320 

30 

CCATGCGCTC AGTGACTAAG CAGGCCATTC GGGAGGCAAG ATTGAAGGAG ATCAAGGAAG 1330 



AGCTTCTGCA TTCTGAGAAG CTTAAGACAT ACTTTGAAGA CAACCCTAGG GACCTCCAGC 1440 



35 TGCTGCGGCA TGACCTACCT TTGCACCCCG CAGTGGTGAA GCCCCACCTG GGCCATGTTC 1500 



CTGACTACCT GGTTCCTCCT GCTCTCCGTG GCCTGGTRCG CCCTCACAAG AAGCGGAAGA 1560 



AGCTGTCTTC CTCTTGTAGG AAGGCCAAGA GAGCAAAGTC CCAGAACCCA CTGCGCAGCT 1520 

40 ■ 

TCAAGCACAA. AGGAAAGAAA TTCAGACCCA. C*S3CC*JkGCC CTCCTGAGGT TGTTGGGCCT 1630 



CTCTGGAGCT GAGCACATTG TGGAGCACAG GCTTACACCC TTCGTGGACA GGCGAGGCTC 1740 



45 TGGTGCTTAC TGCACAGCCT GAACAGACAG TTCTGGGGCC GGCAGTGCTG GGCCCTTTAG 1300 



CTCCTTGGCA CTTCCAAGCT GGCATCTTGC CCCTTGACAA C^GAATAAAA ATTTTAGCTG 1360 



CCCCAAAAAA AAAAAAAAAA AAAAAAACTC GAGGGGGGGC CCGTACCCAA TTCGCCCTAT 1320 

50- 

AA. . 1922 



55 

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1951 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TCGTCCCCAG AGCGGGCTGA GCCCCAGGCG SAGGGTGGCG GGGGAGCCTG GGGGAGCCGC 60

CGCCACCTCC ACGGGCCTCT CTGAGCTCGG ACACCAGCGC CCTGTCCTAT GACTCTGTCA 120

AGTACACGCT GGTGGTAGAT GAGCATGCAC AGCTGGAGCT GGTGAGCCTG CGCCGTGCTT 180



CGGAGACTAC AGTGACGAGA. GTGACTCTGC CACCGTCTAT GACAACTGTG CCTCCGTCTC 240 

CTCGCCCTAT GAGTCGGGGA TCGGAGAGGA ATA.TGAGGAG GCCCCGCGGC CCCAGCCCCC 300 

TGCCTGCCTC TCGGAGGAAC TCCACGCCTG ATGAACCCGA CGTCCATTTC TCCAAGAAAT 250 

TCCTGAACGT YTTCATGAGT GGCCGCTCCC GCTCCTCCAG TGCTGAGTCC TTCGGGCTGT 420 

20 TCTCCTGCAT CA.TCAACGGG GAGGAGCAGG AGCAGACCCA CCGGGCCATA TTGAGGTTTG 430 

TGCCTCGACA CGAAGA.CGAA CTTGAGCTGG AAGTGGATGA CCCTCTGCTA GTGGAGCTCC 540 

■ AGGCTGAAGA CTACTGGTAC GAGGCCTACA ACA.TGCGCAC TGGTGCCCGG GGTGTCTTTC 600" 

25 

CTGCCTATTA CGCCATCGAG GTCACCAAGG AGCCCGAGCA CATGGCAGCC CTGGCCAAAA 650 

ACAGTGAGTG GGTGGACCAG TTCCGGGTGA AGTTCCTGGG CTCAGTCCAG GTTCCCTATC 720 
30 ACAAGGGCAA TGACGTCCTC TCTGCTGCTA TGCAAAAGAT TGCCAGCACC CGCCGGCTCA- . 730 



CCGTGCACTT TAACCCGCCC TCCAGCTGTG TCCTGGAGAT CAGCGTGCGG GGTGTGAAGA 340 

TAGGCGTCAA GGCGGATGAC TCCCAGGAGG CCAAGGGGAA TAAATGTAGC CACTTTTTCC 900 

AGTTAAAAAA CATCTCTTTC TGCGGATATC ATCCAAAGAA CAACAAGTAC TTTGGGTTCA 960 

TCACGAAGCA CCCGGCGGAC CACCGGTTTG CCTGCCACGT CTTTGTGTCT GAAGA.CTCCA • 1020 

40 CCAAAGCCCT GGCAGAGTCC GTGGGGAGAG CATTCCAGCA. GTTCTACAA.G CAGTTTGTGG. 1Q30 

- AGTACACCTG CCCCAGAGAA GATATCTACC TGGAGTAGCT GTGCAGCCCC GCCCTCTGCG 1 140 

TCCCCCAGCC CTCAGGCCAG TGCCAGGACA GCTGGCTGCT GACAGGATGT GGCACTGCTT 1200 

45 

GAGGAGGGGC ACCTGCCACC GCCAGAGGAC AAGGAAGTGG GGCGCTGGCC CAGGGTAGGG 1260 

GAGGGTGGGG CAATGGGGAG AGGCAAATGC AGTTTATTGT AATATATGGG ATTAGATTCA 1320 

50 TCTATGGAGG GCAGAGTGGG CTGCCTGGGG ATTGGGAGGG ACAGGGCTTG GGGAGCAGGT 1330 
CTCTGGCAGA GAAGGATGTC CGTTCCAGGA GCACACGGCC CTGCCCCATC CTGGGCCTTA ' 1440 

CCTCCCCTGC CAGGGCTCGG GCGCTC-TGGC TCCTGCCTTG ATGAAGGCGG TGTGCTGCCT 1500 ■ 

55 

TGATGAAGCC TGTGCCACCT C<1AAGTGCCC QCCCTQCQCC TGCCCCAACC CCCACCGAAG 1560 

AGCCCTGAGC TCAGGGTGAG CCCAGCCACG TCCCAAGGAC TTTCCAGTGA GGAAATGGCA 1620 

ACACGTGGAG GTGAAGTCGG TGTTC7CAGC TCCGTCATCT GCGGGGCTTC TGGGTGGCTC 1680

CTGCCACTGA CCTCACCGGC ATGCTGGCCT GTGGCAGGCC TAGGACCTCA GKGGGGftGG 1740

AGGAGCTGCC GCAAGGCCC? GTCCCAGCAG AAGAGGGAGG CTTCCTGACT GACACAGGCC 1300 

AGCCCCATCT TGGTCCTGTC ACCCTGGCCC GAACTATTXA AGTGCCATTT CCTGTCAAAA 1360 

AAAAAAAAAA AAAATCGGGG GGGGCCCGGA AMCCAA.TTTC CCCCAAAAAG GGGGGTTATA 192.Q 

AAAATTCCCN GGCMGTGTTT TTAAAAATTC G 13 Si 



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3989 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:



GGCACAGGCC GCAGGGNACC TATGGGCGCA TATAGGTTGT AATGAAACTG TAGTCTCAGT 60 

TGGAACCCTA GACATGAAAT GGGTCAGTGA GCAAGGCTCT ATTCCTAGTC TCCAGCCATG 120 

CCTGTGGAAC CTGAF.CCCRC TCTCAGCACA TTGGACCGAG GCAGATGYAA AAAATTCACA. 130 ' 

GAACTATGAT TTGGACTCAA GGGTTTGTAG ATTTCCTCCT TCATTCTAAT TTCAGTGTCT 240 

AAAATTCTTG CATCC RTG AA CGAGCTGGGC ATTTGATGAG ACAGGGCYGA ATACTGCAGT 300 

TTTCCTCCTA GAAATCATCT GGGGCATTTT CTTTGAACTG ATGGGAACAA TA.-GGCATAA. "360 

CTGTTTGCAC AAACTTGGGA TAAP.TGATTT TGGGATAACG ATGTACCAGA ATGGGGATAT '420 

TTCACCCTTG GTTCTGAGAT GCAAACCAAA. GAATATCATG ACCACCTTTC AGGCC TCCTG 430 

" AAGTATA.TCT CTCACATTGT CCTGTTCTCA TGCTGAGGAG CCTGAGATCC CTGTGTGGGG 540 

ATTAGA.GAGT GGACTGTTAT GGGTGTAGGT GAATTGGCTT ATTTTGTCTG TCCCTGTCTG 600 

AATGTATTGC AGGAAYTAAA- AAGGACCAAG AAGAGGAAGA AGACCAAGGC CCACCATGCC 660 

CCAGGCTCAG CAGGGAGCTG CTGGAGGTAG TAGAGCCTGA AGTCTTGCAG GACTCACTGG 720 

ATAGATGTTA TTCAACTCCT TCCAGTTGTC TTGAACAGCC TGACTCCTGC GAGCCCTATG 730 

GAAGTTCCTT TTATGCATTG GAGGAAAAA.C ATGTTGGCTT TTCTCTTGAC GTGGGAGAAA 340 

TTGAAAAGAA GGGGAAGGGG AAGAAAAGAA GGGGAAGAAG ATCAAA.GAAG GAAAGAAGAA 900 

GGGGAAGAAA AGAAGGGGAA GAAGATCAAA ACCCACCATG CCCCAGGC7C AGCAGGGAGC 960 

TGCTGGATGA GAAAGKGCCT -GAAGTCTTGC AGGACTCACT GGATAGATGT TATTCAACTC 1020 

CTTCAGTTGT GTTGAACTGT GTGACTCATG CCAGCCCTAC AGAAG7GCCT TTTATGTATT 1080

GGAGCAACAG CATGTTGGCT TGGCTGTTGA CATGGATGAA ATTGAAAAGT ACCAAGAAGT 1140

GGAAGAAGAC OAGACCCAT CATGCCCCAG GCTCAGGAGG GAGCTGCTGG ATGAGAAAGA 1200

GCCTGAAGTC TTGCAGGACT CACTGGATAG ATGTTATTCG A2TCCTTCAG GTTATCTTGA 1260

ACTGCCTGAC TTACGCCAGC CCTACAGCAG TGCKGTTTAC TGATTGGAGG Ai'C^TACCT 1320

TGGCTTrCKGT CTTGACGTGG ASAAATTGAA AAGAAGGGGA AGGGGAARAA AAGAAGGGGA 1380

AGAAGATCAA AGAAGGAAAG AAGAAGGGGA AGAAAAGAAG GGGAAGAAGA TCAAAACGCA 1440

CCATGCCCCA GGCTCAGCAG GGAGCTGCTG GATGAGAAAG GGCCTGAAGT CTTGCAGGAG 1500

TCACTGGATA GATGTTATTC AACTCCTTCA GGTTGTCTTG AACTGACTGA CTCATGCCAG 1560

CCCTACAGAA GTGCCTTTTA YP.TATTGGAG CAACAGYGTG TTGGCTTGGC TGTTGACATG 1620

GATGAAATTG AAAAGTACCA AGAAGTGGAA GAAGACCAAG *GCCATCATG CCCCAGGCTC 1680

AGCAGGGAGC TGCTGGATGA GAAAGAGCCT GAAGTCTTGC AGGACTCACT GGATAGATGT 1740

TATTCGACTC CTTCAGGTTA TCTTGAACTG CCTGACTTAG GCC^CCCTA C-.GCAGTGGT 1800

GTTTACTCAT TGGAGGAACA GTACCTTGGC TTGGCTCTTG ACGTGGACAG AATTAAAAAG 1860

GACCAAGAAG AGGAAGAAGA CCAAGGCCCA CCATGCCCCA GGCTCAGCAG GGAGCTGCTG 1920

GAGGTAGTAG AGCCTGAAGT CTTGCAGGAC TCACTGGATA GATGTTATTC AACTCCTTCC 1980

AGTTGTCTTG AACAGCCTGA CTCGTGCCAG CCCTATGGAA GTTCCTTTTA TGCATTGGAG 2040

GAAAAACATG TTGGCTTTTC TCTTGACGTG GGAGAAATTG AAAAGAAGGG GAAGGGGAAG 2100

AAAAGAAGGG GAAGAAGATC AA£-fGAA.GRAA AGAAGAAGGG GAAGXAAAGA AGGGGAAGAA 2160

GATCAAAACC CACCATGCCC CAGGCTCAAC GGCGTGCTGA TGGAAGTGGA AGAGCSTGAA 2220

GTCTTACAGG ACTGACTGGA TAGATGTTAT TCGACTCCGT CAATGTACTT TGAACTACCT 2280

GACTCATTCG AGCACTACAG AAGTGTGTTT TACTCATTTG AGGAACAGCA CATCAGCTTC 2340

GCCCTTTACG TGGACAATAG GTTTTTTACT TTGAOGGTGA CAAGTCTCCA CCTGGTGTTC 2400

CAGATGGGAG TCATATTCCC ACAATAAGCA GCCCTTASTA AKCCGAGAGA TGTCATTCCT 2460

GCAGGCAGGA CCIATAGGCA MGTGAAGATT TGAATGAAAG TACAGTTCCA TTTGGAAGCC 2520

CAGACATAGG ATGGGTCAGT GGGCATGGCT CTATTCCTAT TCTCAAACCA TGCCAGTGGC 2580

AACCTGTGCT GAGTCTGAAG AGAATGGACC CACGTTAGGT GTGACACGTT CACATAACTG 2640

TGCAGCAGAT GCGGGGAGTG ATCAGTCRGA CATTTTAATT TGAACCACGT ATCTCTGGGT 2700

AGCTACAAAA TTCCTCAGGG ATTTCATTTT GCAGGCATGT CTCTGAGCTT CTATACCTGC 2760

TCAAGGTCAX TGTCATCTTT GTGTTTAGCT CATCCAAAGG TGTTACCCTG GTTTCAATGA 2820

ACCTAACCTC ATTCTTTGTG TCTTCAGTGT TGGCTTGTTT TAGCTGATCC ATCTGTA^CA 2880

CAGGAGGGAT CCTTGGCTGA GGATTGTATT TCAGAACCAC CAACTGCTCT TGACAATTGT 2940

T AAC CCGCTA GRCTCCTTTG GTTAGAGAAG CCACAGTCCT TCAGCCTCCA ATTGGTGTCA 3000 

5 GTACTTAGC-A AGACCACAGC TAGA.TGGACA AACAGCATTG GGAGGCCTTA GCCCTGCTCC 3060 

TCTCSATTCC ATCCTGTAGA GAAXLAGGAGT CAGGAGCCGC TGGCA.GGAGA CAGCATGTGA 2120 

CCCAGGACTC TGCGGGTGCA GAATA.TGAA.C AAYGCCATGT TCTTGCA.GAA AACC-CTTAGC ^30 

10 

CTGAGTTTCA TAGGAG-GTAA T CA.CC AG AC A ACTGCAGAAT GTRGARCACT GAGCAGGACA 3240 

GCTGACCTGT CTCCTTCACA TAGTCCATRT CAC'CACAAA.T C-.CACAA.CAA AAAGGAGARG 3 300 

15 AGATATTTTG GGTTCAAAAA AAGTAAAAAG ATAATGTAGC TGCATTTCTT TAGTTATTTT 3360 

GARCCCCAAA TATTTCCTGA TCTTTTTGTT GTTGTGATKG ATGGTGGTGA CATGGA.CTTG 3420 

'TTTATAGAGG ACAGGTC.AGG TGTCTGGCTC AGTGATCTAC ATTCTGAAGT TGTCTC-AAAA. 3430 

20 

TGTCTTCATG ATTAAATTCA GCCTAAACGT TTTGCCGGGA ACACTGCAGA G^.CAATQCTG 3 540 

TGAGTTTCCA ACCTYAGCCC ATCTGCGGGC AGAGAAGGTC TAGTTTGTCC AT C A3 C ATT A 3600 

25 TCATGATATC AGGA.CTGGTT ACTTGGTTAA GGAGGGGTCT AGGAGATCTG TCCCTTTTAG 3 560 

AGACACCTTA CTTATAATGA AGTATTTGGG* AGGGTGGTTT TCAAAATTAG AAATGTCCTG 3720 

TATTCCPA.TG ATCATCCTGT AAACATTTTA TCATTTATTA ATGATCCCTG CCTGTGTCTA 3730 

30 

TTATTATATT CATA.TCTCTA CGCTGGAAAC TTTCTGCCTC AATGTTT ACT GTGCCTTTGT 3 340 

TTTTGCTAGT GTGTGTTGTT GAAAAAAAAA ACATTCTCTG CCTGAGTTTT AATTTTTGTC 3 900 

35 CAAAGTTATT TTAATCTATA CAATTAAAAG CTTTTGCCTA TCAAAAAAAA AAAAAAAAAA 3 960 

AJ^AAAAAAAA AAAAAGCGGA CGCGTGGGC 3 939 

40 

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3735 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

CTGCTGTTCG CTGGCTGGGC TCCGCAGCAG GCTTGGCCAG CSGCTGACGG GTCGGCGGGC 60

GGGTTTGTGT GAACAGGCAC GCAGCTGCAG ATTTTATTCT GGTAGTGCAW CCCTCTCAAA 12 Q 

55 

GGTTGAAGGA ACTGATGTAA CAGGGATTGA AGAAGTAGTA ATTCCAAAAA AGAAAACTTG 13 0 

GGATAAAGTA GCCGTTCTTC AGGCACTTGC ATCCACACTA AA.CACGGATA. CCACAGCTGT 240 

CCCTTATGTG TTTCAAGATG ATCCTTACCT TATGCCAGCA TCATCTTTGG AATCTCGTTC 300

ATTTTTACTG GCAAAGAAAT CCGGGGAGAA TGTGGCCA-.G TTTATTATTA ATTCATACCC 360

CAA.ATATTTT CAG.-AGGACA TAGCTGPuA.CC TCATATACCG TGTTTAATGC CTGAGTACTT 420 

5 

TGAACCTCAG * ATCAAAGACA TAAGTGAAGC OGCOCTGAAG GAACGAATTG AGCTCAGAAA 430 

AGTCAAAGC C TCTGTGGACA TGTTTGATCA GCTTTTGCAA GCAGGAACCA CTGTGTCTCT 540 

10 TGAAACAACA AATAGTCTCT TGGaTTTWTT GTGTTACTAT GGTGACCAGG AGCCCTCAAC 600 

TGATTACCAT TTTCAACAAA CTGGACAGTC AGAAGCATTG a-A.GAGG.iAA ATGATGAGAC 560 

ATCTAGGAGG AArGCTGGTC ATCAGTTTGG AGTTACATGG CG AGCAAAAA ACAACGCTGA 720 

15 

GAGAATCTTT TCTCTAATGC CAGAGAAAAA TGAACATTCC TATTGCACAA TGATCCGAGG 730 

AATGGTGAAG CACCGA3CTT ATGA3CAGGC ATTAAACTTG TACACTGAGT TACTAAACAA 340 

20 CAGACTCCAT GCTGATGTAT AC^ATTTAA TGCATTGATT GAAGCAACAG TATGTGCGAT 900 

AAATGAGAAA TTTGAGGAAA AATGGAGTAA AATACTGGAG CTGCTA^GAC ACATGGTTGC 960 ' 

ACAGAAGGTG AAACCAAATC TTCAGACTTT TAATACCATT CTGAAATGTC TCCGAAGATT 1020 

25 

TCATGTGTTT GCAAGATC-GC CAGCCTTACA , 'GGTTTTACGT GAAATGAAAG CCATTGGAAT 1030 

AGAACCCTCG CTTGCAACAT ATCAGCATAT TATTCGCCTG TTTGATCAAC CTGGAGACCC 1140 

30 TTTAAAGAGA TCATCCTTCA TCATTTATGA TATAATGAAT GAATTAATGG GAAAGAGATT 1200 - 

TTCTCCAAAG GACCCGGATG ATGATAAGTT TTTTCAGTCA GCCATGAGCA TATGCTCATC 1250 

TCTCAGAGAT CTAGAACTTG CCTACCAAGT ACATGGCCTT TTAAAAACCG GAGACAACTG 1220 

35 

GAAATTCATT GGACCTGATC AA.CATCGTAA. TTTCTATTAT TCCAAGTTCT TCGATTTGAT 1380 

TTGTCTAATG GAACAAATTG ATGTTACCTT GAAGTGGTAT G AGG AC CTGA TACCTTCAGC 1440 

40 CTACTTTCCC CACTCOCAAA CAATGATACA TCTTCTCCAA GCATTGGATG TGGCCAATCG 1500 

GCTAGAAGTG ATTCCTAAAA TTTGGAAAGA TAGT^AAGAA T ATGGT CAT A CTTTCCGCAG 1560 

TGACCTGAGA GAAGAGATCC TGATGCTCAT GGCAAGGGAC AAGCACCCAC CAGAGCTTCA 1620 

45 

•GGTGGCATTT GCTGACTGTG CTGCTGATAT CAAATCTGCG TATGAAAGCC AACCOATCAG 1530 
ACAGACTGCT CAGGATTGGC CAGCCACCTC TCTCAACTGT ATAGCTATCC TCTTTTTAAG ■ 1740 

50 GGCTGGGAGA ACTCAGGAAG CCTGGAAAAT GTTGGGGCTT TTCAGGAAGC ATAATAAGAT 1300 

TCCTAGAAGT GAGTTGCTGA ATGAGCTTAT GGACAGTGCA AAAGTGTCTA ACAGCCCTTC 1360 

CCAGGCCATT GAAGTACTAG AGCTGGCAAG TGCCTTCAGC TTACCTATTT GTGAGGGCCT 1920 

55 

CAC CCAGAGA GTAATGAGTG ATTTTGCAAT CAACCAGGAA CAAAA.GGAAG CCCTAAGTAA 1930 

TCTAACTGCA TTGACOGTG ACAGTGATAC TGACAGCAGC AGTGACAGCG ACAGTGACAC 2040 

CAGTGAAGGC AAATGAAAGT GGAGATTC.-G CAGCAGCAAT GGTCTCACCA TAGCTGCTCG 2100

AATCACACCT GAGAACTGAG ATATACC-AT ATTTAACATT GTTACAAAGA AGAAAAGATA 2160

CAGATTTGGT GAATTTGTTA CTGTGAGGTA. CAGTCAGTAC ACAGCTGACT TATGTAGATT 2220 

5 

TAAGCTGCTA ATATGCTACT TAACCATCTA. TTA-ATGCACC ATTAAAGGCT TAGCA.TTTAA. 2230 

GTA.GCAA.CA-T TGCGGTTTTC AGACA.CA.TGG TGAGGTCCAT GGCTCTTTGTC ATCAGGATAA 2340 

10 GCCTGCACAC CTAC-AGTGTC GGTGAGCTGA. CCTCAGGATG CTGTCCTCGT GCGATTGCCC 2400 

TCTCCTGCTG CTGGACTTCT GCCTTTGTTG GCCTGATGTG CTGCTGTGA? GCTGGTCCTT 2460 

CATC TTAGGT GTTCATGCAG TTCTAACACA GTTGGGGTTG GGTCAATAGT TTCCCAATTT 2 5 20 

15 

CAGGATATTT CGATGTCAGA AATAACGCAT CTTACGAATG ACTAAACAAG ATAATGGCAC- 2530 

TTTAGGCTGC ACAACTGGTA AAATGACTGT AGATAAATGT TGTAATTA.GT GTACACGTTT >2S40 

20 GTATTTTTGT TAATATACCC GCTGCCATAG TTTTCTAACT TGAACAGCCA. TGAA.TGTTTC 2700 

ATGTCTCCCT TTTTTTTTTO TCTATAGCTG TTACCTATTT TAGTGGTTGA AATGAGAGCT 27 60 

AGTGATGACA GAAGGATGTG GAATGTCTTC TTGACATCA-T TGTGTATTGC TGGTAA.TCAA. 2320 

25 

GTTGGTAACG ACTACTTCTA GCAGCTCTTA CCACTATGA.C TTAAGTGGTC GTGGAAGGCA 2330 
GTAAGTGGAG GTTTGCAGCA TTCCTGCCTT CATGA.GGGCT TCTACCACTG ACCACTTTGC - 2940 

30 ACGTACCTGG CTCCCAGATT TA.CTTAGGTA CCCCACGAGT CGTCCACA.TA AGCAGCTTCA 3000 

TCTTTACCTT GCCAGAGTTG ACAA.TTATGG GA.TAGTCTAG TCTACTTATA CTTGTGTTCC 3060 

CATCTGTCTG CCATCCTCTG AAGGCCAGGA CCCAGTCATA CATCCTTAGA AACCAAAGTA 3120 

35 

TGGTTTTTGT TTTCTCTTGG AATGTCAGGT CTTAAGGCAT TTAATTGAGG GA.CAAAAAAA 3130 - 

AAAA.AAAGCC GATATAGTAG CTAGCTACTT AAGCATCCAT GGGTATTGCT CC AT AT CAAA 3240 

40 GGAGATTTGC AGGACAGAAA GA.GTAAATTA GCCTTCAGTC TTGGTTTACA GCTTCCAAGG 3300 

AGAGCCTTGG CCACCTGAAA. TGTTAACTCG GTCCCTTCCT GTCTCT AGTT - CATCAGCA.CC 3 350 

TGCAGATGCC TGACTCTTGT TAGCCTTACT ATTCAA.TACA GTCCTTAGAT TCAGGGTATG 3420 

45 

CCTCTTCCTA TCCAGGCACC TATTCTGAAT CACCATGTTG CTCTGCAGCT AGAGTTGA.TA 3430 

GGA.GAAAATC CATTTGGGTA GATGGCCTAT GAATTTGTAG TAGACTTTCA AAATGAGTGA 3S40 

50 TTTGTTAGGT TGGTACTTTT AAGTTTGTGG TAGAGATCCT CCAAACCCAT ACTCTGAGCA. 3500 

ATTAAGTGCC TTGAACATAG AGAAAATTAA GGCCTCACAG GATGAGTCTC CATTCTCTGT 3 560 

AAATGCTTAT TTTATCATAG TCTTTAGCCN CTACTATGAG TAAAATGTTC TCTTOTGCCG 3720 

GGTGTGGTGA CTCAC 373 5 



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1667 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

TAGTAATTCA TTTAACTCCT CTTACATGAG TAGCGACAAT GAGTCAGATA TCGAAGATGA 60

AGACTTAAAG TTAGAGCTGC GACGACTACG AGATAAACAT CTCAAAGAGA TTCAGGAC CT 120 

15 GCAGAGTOGC CAGAAGCATG AAATTGAATC TTTGT AT AC C AAACTGGGCA AGGTGCCCCC 130 

TGCTGTTATT ATTCCCCCAG CTGCTCCCCT TTCAGGGAGA AGACGACGAC CCACTAAAAG 240 

CAAAGGCAGC AAATCTAGTC GAAGCAGTTC CTTGGGGAAT AAAAGCCCCC AGCTTTCAGG 300 • 

20 

TAACCTGTCT GGTCAGAGTG CAGCTTCAGT CTTGCACCCC CAGGAGACCC TCCACCCTCC 3 60 

TGGCAACATC CCAGAGTCCG GGCAGAATCA GCTGTTACAG CCCCTTAAGC CATCTCCCTC 420 

25 CAGTGACAAC CTCTATTCAG CCTTCACCAG TGATGGTGCC ATTTCAGTAC CAAGCCTTTC 430 

TGCTCCAGGT CAAGGAACCA GCAGCACAAA CACTGTTGGG GCAACAGTGA ACAGCCAAGC 540 

-CGCCCAAGCT CAGCCTCCTG CCATGACGTC CAGCAGGAAG GGCACATTCA CAGATGACTT 600 

30 

GCACAAGTTG GTAGACAATT GGGCCCGAGA TGCCATGAAT CTCTCAGGCA GGAGAGGAAG 660 

CAAAGGGCAC ATGAATTATG AGGGCCCTGG AATGGCAAGG AAGTTCTCTG- CACCTGGGCA 720 

35 ACTGTGCATC TCCATGACCT CGAACCTGGG TGGCTCTGCC CCCATCTCTG CAGCATCAGC 780 

TACCTCTCTA GGTCACTTCA CCAAGTCTAT GTGCCCCCCA OGCAGTATG GCTTTCCAGC 340 

TACCCCATTT GGCGCTCAAT GGAGTGGGAC GGGTGGCCCA GCAGCACAGC CACTTGGCCA 900 

40 

GTTCCAACCT GTGGGAACTG CCTCCTTGCA GAATTTCAAC ATCAGCAATT TGCAGAAATC ' 960 

CATCAGCAAC CCCCCAGGCT CCAACCTGCG GACCACTTAG ACCTAGAGAC ATTAACTGAA 1Q20 

45 TAGATCTGGG GGCAGGAGAT GGAATGCTGA GGGGGTGGGT GGGGGTGGGA AGTAGCCTAT 1Q80 

ATACTAACTA CTAGTGCTGC ATTTAACTGG TTATTTCTTG CCAGAGGGGA ATG T TTTTAA 1140 

TACTGCATTG AGCCCTCAGA ATGGAGAGTC TCCCCCGCTC CAGTTATTGG AATGGGAGAG T?00 

GAAGGAAAGA ACAGCTTTTT TGTCAAGGGG CAGCTTCAGA CCATGCTTTC CTGTTTATCT 12*60 

ATACTCAGTA ATGAGGATGA GGGCTAGGAA AGTCTTGTTC ATAAGGAAGC TGGAGAACTC 1220 

55 AATGTAAAAT CAAACCCATC TGTAATTTCG AGTGGGTGGA GCTCTTGCTT TTGGTACATG 1380 

CCCTGAATCC CTCACTCCCT CAAGAATCCG AACCACAGGA GAAAAACCAC CTACTGGGCT 1440 

CTCTCCTACC CTGCCCTCCT CCCTTTTTTT TACCCCTCTC TTTTXTATT^ f r^rz f T rr ^CCy "500" 

CTTTAGAAC:: CAGTXjAAAAA TACCAGGGTA CTGGGGTGCA ACTCTTTGTT ATGATAGG7G 1560

ATTAGTGCTT TAAGCAAAAG ATATTAGCAG CTTTGACTGC AGCATTAGCA ATTAGGRAAA 1620

AAAAAAAMWA AAAACTCGAG GGGGGGCCCG GTTAGCCAAT TCGCCCT 1667

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1408 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:



ATTACACACC TGAGCACTGT GCCTGGCAAG ACCTCTCTTA ATAGATTAGA 60
TAGATGGTCA GCXTTCTGTA GCAGTGAGAA CCCTACATTT CAAATGTGGA 120
GCGGGGAAAC ATC^CTTGGC ACATCTGCAT TCTTTTTTGA CACAGGGTCT 180
CCGAGGCTAG AGTGCATGGC ACGATCTTAG CTCACTGCAA CCTCCACCTC 240
GCGATTCTTC TGCCTCAGCC TCCTCAGCAG CTGGGATCAC AGACATGCGC 300
AGCTAATTTT TTGTATTTTT TGTSCTGTTTG TTTTTGTTTK TAAGTAGAGA 360
CCACGTTGGS CAGGCAGGTC TCGAACTCCT GAMCTCAGGT GATCCACCCA 420
CCAATATCTT TCTCAACATA ATGATAGCCG TAATTAATAT TTTCCAGTAC 480
CTTTACACAC GAGAGTGGTA GACAGACACA AACCCAGATC TGTCTGACTC 540
TTGTCATCAT TCCTTTTACG GTATCCTATA GTGGTATCCT TTACAGAAAG 600
CCCAACAAAG ACTTAACTTC CCAGGATGCC AGAAGGACAA AGCGGGATTG 660
GKAAGTTATC AAGAMCTTAT TTTATAAATG AGATTAGATA GGGAAAGGCA 720
ATTAAAAACT GAAAAGGCCA GCATAGGGAA GGAGGTCCTT CGGTGGTCTT 780
ATACTTGAGT TGCTTTTATT AGAAACAGAT AGTACCTAAG GTTTTGAGGT 840
TAAGGCATGC TAATGKTCAT GGGTCCTTCC ATAGTCATTT TKGTATTTTG 900
GAGCAATAGG GAGCCCTTGA CTGCTGCTGG AYTCATTCCT GCCAYTATTA 960
AGGAGACAGG AGGTATGTCT TTTCTATTTT TAWACATGCT TTATATTTAA 1020
TGGGTATGTT AGATAAACAG AAGTTGCCTA GCACTCCTTT TAGTGCATTG 1080
CATTTAAGCA AAATAATAAA CAGTCTTTTG AGGTTCCTTA ACAATGAAAC 1140
GGCAGCAGCG GAATCCATGC YTCTTCTCCT GGAGTGTGGA AKAGTCCGTG 1200
TCTCACACAG ATGTGGCATT TTATGTGTGA TGCTCTAATT AAGGCCATTG 1260
PilATTCAGAC GTCCTCTCAG A-AJTAATGCA TTCTTTTGCA AAGGTGAATA TTTTTCTCTT
AAAAAATADG TATAACGTGG TATGTTO.TT TATTAGTCTT GCTAAAAAAA AAAAMAAAA 
AJCTOKSMGG GGGSrCGGGT ACO^r? 

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2031 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

AGGATATGCA TGATTCTTAA CCSGGCTATA TGTTAAAAAA AAATTGGAAA ATGCAATACA 

rmTTA CTA TACAAACTAC AGAATGAGTA TGCAAGTTTT ATTTATCAAA ATGTAATGGA 

TTTTTAAAGG CTSAGAAATT TTCCTTATAC CTACCTTTTC PsGTTATTTTA ATTATACCAA 

A.rTATCAACT AGAATAGGTT CATCCATATG AAATATAAAA TGAAGAGACA CCTAGGCTCT 

ATCAGGCTTA GGATTCTTTG AnCTTATTTC CACTTTAATT TCTCAGTGGA AGTTAAGAGG 

C<7TGAGAAAA CAAAGAAGGG GA-AAACTGA CAACTAACAA AACCAGCACC ACATCGCTAG 

GTGGTGCTTA CTAATTACCT TCTCAGGATT TTCCTCAGAT TGAAAAGCTT ATGAGGATTT 

CTTGGGAGTC TTAATAACCT GCCTGTTAGT ACAGAGCTTT CCTGATGATA TTTACTCTTG 

AGCACATGTG GTTGTAAAAC CTTSuACTTTC TTTCTCCAGG AGGGTGGTGA TAGAAACA3A 

XGGTAGTA.iT TAITGAACTGA TGTTCTCGTG >AATGTTGAG GGTGGGGAGA AAnGnCTTTA 

AGGGAGGAGA GCCATCTATT TTGTTCCTAA AGCCACCTCT CAGCAGAATC GTCATGTTTT 

TCTGATGCAC CGCTCTGCTT CATGCCCAA3- ATGAC 1TGCG AGGCAATCTC AGGAGCTGTG 

GACTTAACC2. TTGCAAAGCA C-GTGTCTTT CTCAGCGTTC TCTGCAAGTC AGTAGGTGTT 

AGTATGGTTG CAAAGTTCAC TGTCTCAGCA AAGTTGAACT GGGCTACCTC TCTACAGCTG 

TTTCCTCAGA GGGAAAArvTC TTGAGACCAG ATGGTGGAGC TCTGGA3TCA GAGGAAATGG 

GTGTCTTCAG CACAAAGCTG CTGCTTTTAC . TTCAGCCACT TCTGACATTT TTACATACCG 

AGCCTGAGAT TFJTGTGATTA TCTCAAATCA AATCACTTTG ATGGAGATAA ATAATCAAAA 

CTGTTTTATA GTGATTGATT TGGTGAG.-AC AGTAATGGAA AATGGTGTTG AAGGACTTCT 

CATTTTTGGA G CTTT CCTTC C-GAGTCCTG GCTGATTGGT GTTCGCTGTT CATCTGAGGC 

CCCAAAAGCA TTATTACTGA TACTTGCACA CAGTCArAAG CGCAGACTGG ATGGATGGTC
TTTTATAAGG CATTTAAGGG TACACTACTG TGTTTCACTG ACCATACATT TTTCTTAGCC 1260

CCTCAA.GTAA TATAGCACAG &GTTATGAAT GACAATTC C C CTAACCATTC CTCTTCATA.T 1320 

CTGCCTCTTC CCCTTACCAT CGTAATTCTC C-AACTGG7C ATAAAC-GCAC TCTGTGAAGA 15 SO 

TATTGGGGAC TGACATCTTA AGCTCTCACC TGGCTGCAGT AGGAAAGGCC AAACTGACGA 1440 
CAAAAAAAAA ATTCTTTATA AAGATGATAT GGTAACATGT ATCTTTGCCC TGGGTCTGGG . 1500 

TGGGTCCAGT CAGTCTCAGA TTTACAAGCA TTTAGGAGCC TAGGTAAAAG CTGCTAGTAT 1560 
TCTTTTAAAA GTTACATTTA TGACTTGCAA TGATAGAAAA CTCCTTCCAA TTAAATGGCA " 1620 

1 5 TTTTATAATA TTATGTGTGT ACTTCACAGT GTTAAAAATA CCCTCATACG TTATTCCATT 1530 

TGATCTTCAC AGAAAGTGCA TTTTAACCAG TACTCTGGGT GCAA.TAA-ATA ATATGTAGAA 1740 

ATTTAAGTCC TCCAATTCCA GCATATCCAG TGAGTTTTGA CAGTGTGTTT ATGTGGAATG 1300 

20 

TTT AAGGA.T A TACAATTGTA CTTTATATAA ATTGGTTCTT GTTCTTCTTA AATGTGACAT 1360 

GAAATAATTG TGCTGCTACA TTATA.CTGGA AATTAACAGG GGAAAAGGGA AGAGCTCTTG 1920 

25 GCTCCCTTGA GGTTCTGCTA GTGGTGTTAG GA.GTGGTTA.C AACTGAGCTT TTAGTAACCA 1930 

TTTAACCGTA. TGTAAACTTG GTTTCTAATT -AAAAAAAAAT TTCTTTTTCC A. 2031 



30 



(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 971 base pairs
(B) TYPE: nucleic acid
(D) TOPOLOGY:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CGCGTCGGAA CTCGGCCGCG GGACATCCAC GGGGCGCGAG TGACACGCGG a-JGGGAGAGC 60 

AGTGTTCTGC TGGAGCCGAT GCCAAAAACC ATGCATTTCT TATTCA.GATT CATTCTTTTC 120 

45 

TTTTA.TCTGT GGGGCCTTTT TACTGCTCAG AGACAAAAGA AAGAjGGAGAC CACCGAA.GAA 130 

GTGAAAATAG AAGTTTTGCA TCGTCCAGAA AACTGCTCTA AjGACAAGCAA GAAGGGAGAC 240 

50 CTACTAAATG CCCATTATGA CGGCTACCTG GCTAAAGACG 'GCTCGAAATT CTACTGCAGC 300 

CGGACACAAA ATGAAGGCCA CCCCAAATGG ■ TTTGTTCTTCT GTGTTGGGCA AGTCATAAAA. -350 

GGCCTAGACA TTGCTATGAC AGATATGTGC CCTGGAGAAA AGCGAAAAGT AGTTA.TACCC 420 

55 

CCTTCATTTG CATACGGAAA GGAAGGCT AT ' GCAGAAGGCA AGATTCCACC GGATGCTA.CA. 4S0 

TTGATTTTTG AGATTGAACT TTATGCTGTG ACCAAAGGAC CACGGA.GGAT TGAGACATTT 540 

AAACAAATAG ACATGGACAA TGACAGGCAG CTCTCTAAAG CC3AGATAAA CCTCTACTTG 600

CAAAGGGAAT TTGAAAAAGA TGAGAAGCCA CGTGACAAGT CATATCAGGA TGCAGTTTTA 660

• GAAGATATTT TTAAGAAGAA TGACCATGAT GGTGATGGCT TCATTTCTCC CAAGGAATAC 720 

AATGT AT AC C AACACGATGA ACTATAGCAT ATTTGTATTT CTACTTTTTT TTTTTAGCTA 730 

TTTA.CTGTAC TTTATGTATA AAACAAAGTC ACTTTTCTCC AAGTTGTATT TGCTATTTTT S40 

C CCCT ATG AG AAGATAiTiT GATCTCCCCA ATACATTGAT TTTGGTATAA TAAATGTGAG 9Q0 

4 

GCTGTTTTGC AAACTTAAAA r^AAWWAAA AAAACTSGAG GGGGGCCCGT ACCCAAXOTCG 960 

CCGNATATGA T 971 



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1792 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

GAACCCCCTT TCTCCTGGTA AAGGGTAAGG GGGGGGATAA TGTTTACCAC A3GTACGAAA 60

TAGTCACTTT AACATTGAGA CCTCTGCCTC ATTGAATTCA GGTTTTTTAA GTACTTGAAA 120 

CTCTTCAGAT TCTCCTTATT TTAGTTTCTT TTTACATTTA TGAAGTAGAA AGCATTGTTT 130 

TGTAAACTGT TTTGAAAATA AATAGCCTAG TCTCTTATCC TCTTTAGCGT GGATTAAAGG 240 

TGAAGTTCTG CAAATGGGAG AGTGTTCACA GTAGATAGCT CAGATTGATT GAACACATTT 300 

GAGGAAGAGA CTCCTGCATG AGATACGAGC ATTTTTACAA ATACTTTTTA TGTACATTCT 360 

TTATTTTGTC ATTTTGTCAA CCCTCTCCCC AAGCACATCT TCTTTCCTTT TACTATGTCT 420 

ATGTAGGGAA AAACAAAI\CA AAAAATTGCA CTTACGTTAC ACTCCCAAAA TGTGGGTAAT 430 

CCGTGTCTTT CAAAAAACAT TTCTGTTTTT TGTTTTGTTT TGGTCAGTCC ATTGCATAAG 540 

TGACAAGTTT GGGTGCTTGT GGCACGTATG TATGAAGCGG GAGGGGGATG A3AATTGCCT 600 

GTCCTTCAGT APJGCTGTAAA AGTAATTTAC ATGTAAGTAA AAAGGGAAAA TAGAATAGAT 560' 

GCCAAAGTCA 'TTTATTCAGT CCTTAGTTTT CTTATGTGGC ATTACTGCAT CTGCTAGTTA 720 

GTGAGAAAGC ACCCTCAGCT TTTACTGCTC CCCTCCCTGC CTCOCAACAC ACTTGATGTG 730 

TGCAAACAGC C CTCAAGT AT CTGTCAGATG ACCTATATAA GGTATTGAAT AAGGTATTCT 840 

TGTCAGTTTA GAAATGGACT GGATAAAACT TACTTGGTTG TCATTATTTT ATCTCATTTG 900 

TCCTGTTACA TGCCCTATGT TA-.GAT.-ATT ATATTGGCAC TAATAATCAA GATGCTAAAT 960

GACTATTACA ACTGGCTAAT ATCATTTTTT ATATACAAGG GTATGTGTAT ATTTGGAATT 1020

GRTATGAGAA ACTCATTTGT ACCCATTTGA GTGATATTGC ACLAACAAACA CAGATAYCTA 1Q8Q 

CAGACTCCGT TTTCATTTIC TCGTGTTCTT TATGATAATG ATCTTTGTAG ATTGGTTATT 114 Q 

TCTGTAGTTT ATCTGTAATA AACTTTGTAG ATCCTGTGAA. CCATTACTTT GCCTAAATCA 1200 

CTTGAGACTT s GAGTGTTTAA TAACAAAGCA TCAATATTCA CTAAAGTCAA TCTCTTTTGA • I2S0 

GTTTCTGTGA CTTGGCTAGA AGCTCTTGAC ACTAAGGGAT TAGTGTTAAT TTTCCGTGGG 1320 

GGTGTTCCAC TAGGGCATTA CTGTATAATG ACT TG ATGTT GGCACATAGA CTTCAAGA.TA 1330 

15 TATAATATTT TGAGGATTTT GTTGATTGGC CTATGTTTTA TTGGATAGTG TGAAACGTGT 1440 

AAAGGTTGGT TAACCTGTAT ATAGATAGCT TATTGTTGAC TAGTTATAGT GTATTTAGGG IS 00 

TTGCCTGTAA TATTTAAGCT TCTTTACTGA TGTGTGTGCT GGTAGGAACA. TATAATTTTT 1360 

20 

GTACATTATA TTTAGTGAGA TGTTGGCTTT TTTATTTTAC AAATACTTTG GAATTCCAAT 1520 

GTGTTTTTTG CTTCCGTGAG CATTAATTTG GAAAGGTTTT TAATGACATT CCACTGATTT 1580 

25 GAGATTTTGC TTGAGATTC-A CTTCAATAAA 'TTGTCCTGTA TGTTCCAAAA AAAAATTAAA 1740 

AAACTGGAGG GGGGCCCGOT ACCCAANNCG 6CGGATATGA TCGTAAACAA TC 1792 



30 



60 



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 896 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

AGTTGNAMAC AACAGGACCT GAGTCCTTGG GCAGCACCAG TAGGTTGCCC CYTGCYTCYT 60 

GCCAGCYTCA CTTGCCACYT TYTGCCCCTY TCGGGATGCC TTCGCAGACA GAGYTYTTCG 120 

45 

CTGCCTGTGG TGGCCAYTCT 'TTGCTTTTGG TTYTCTTGCC CCTTGGCCTC CCTTTTTGTC 130 
CCCGGGCAGC CTTGTGTGAC CTGCCCTTTT CCCTCCCTTC CTTTCCAGGA CAAGCACGCC " 240 

50 GAGGAGGTGC GGAAAAACAA GGAGCTGAAG GAAGAGGCCT CCAGGTAAAG CCTAGAGGCC 300 

AAAGAACTTT CCAGGTCAGC CGGACAGCTC CAGCAGCTCC AC GTTCC-.GG CA.GCCTCGMC 360 

IGCCGGCTGC GCTCCCAGCA CTGGGGTTTG GGGGGAGGGG GGTGGCCAAG GGGCGTTTCC. 420 



TCTGCTTTTG GTGTTTGTAC ATGTT AAGAA TTGACCAGTG AAGCCATCCT ATTTGTTTCC 480 
GGGGAACAAT GACGGGGTCG GARAGGGGAG AGGAGAGAGT TTGGGAAA.G3 GAGATGGAGA 540 
AGAi.CTCAAG GACATTGCAA CCCTGGCCGG CGCAGATCTG ATTTTCACAT CTCTACCTGG 600
ACATTGAGCC TCCCAGGCAC CATGTTGAGG AGAGATGAAA ACCAGGGCGG TAGAACTTCA
GGGTGAAGGA CAGAGTCCTG GGTGGGGCAG CGGCTGCAGG GCGCACCAGA GAACCCAGCC 
AGAGGGGGTG TGAGTACCAG TGGTGTTGCT TCCACCCTGC AGC-GGTGGG ATGAGGTCTG 
TGTGTGTGTG TGAftGCaSCA TTTTTTGATC ATCATGACCA ATGAAAO.TT GAAAAXAAAA 
AAAAAAACTG GAGGGGGGCC CGTACCCAAN TCGCCGMATA GTGATCGTAA ACAATC 

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 912 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

TCGACCCACG CGTCCGGTCA GCCAGTCGCA TCCAGCCATG ACAGCCTTCT GCTCCCTGCT 

CCTGCAAGCG CAGAGCCTCC TACCCAGGAC 'CATGGCAGCC CCCCAGGAGA GCCTCAGACC 

AGGGGAGGAA. GACGAAGGGA TGCAGCTGCT ACAGACAAAG GACTCCATGG CCAAGGGAGC 

TAGGCCCGGG GCCAKCCGCG GCAGGGCTCG CTGGGGTCTG GCCTACACGC TGCTGCACAA 

CCCAACCCTG CAGGTCTTCC GCAAGACGGC CCTGTTGGGT GCCAATGGTG GCCAGCCCTG 

ARGGCAGGGA AKGTCAACCC ACCTCCCCAT CTGTGCTGAG GCATGTTCCT GCCTACCATC 

CTCCTCCCTC CCCGGCTCTC CTCCCAGCAT CACACCAGCC ATGCAGCCAG CAGGTCCTCC 

GGATGACYGT GGTTKGGTGG AGGTCTGTCT GCACTGGGAG CCTCARGAr.G GCTCTGCTCC 

ACCCACTTGG CTATGGGAGA GCCAGCAGGG GTTCTGGAGA AAAAAACTGG TGGGTTAC-GG 

CCTTGGTCCA. GGAGCCAGTT GAGCCAGGGC AGCCACATCC AGGCGTCTCC CTACCCTGGC 

TCTGCCATCA GCCTTGAAGG GCCTCGATGA AGCCTTCTCT GGAACCACTC CAGCCCAGCT 

CCACCTCAGC CTTGGCCTTC ACGCTGTGGA AGCAGC CAAG GCACTTCCTC ACCCCYTCAG 

CGCCACGGAC CTYTYTGGGG AGTGGCCGGA AAGCTCCC 30 GCCTYTGGCC TGCAGGGCAG 

CCCAAGTCAT GACTCAGACC AGGTC C CAC A CTGAGCTGCC CACACTCGAG ACCCAGATAT 

TTTTGTAGTT TTTATKCCTT TGGCTATTAT GAAAGAGGTT A3TGTGTTCC CTGCAATAAA 

CTTGTTCCTG AG / 



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1382 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:



AATTCGGCAC GAGCGGAGGC GAGCGAAACT RftGGGCGAAA GTTGTGTGTC GTGTTGGCAG 60
GAGGGCCTAG AAGGGAAAGA CTGTCTAGTG GGACAATGTC ATATTATAAA TTTGGAATGC 120
TGAATAGAAA ATTATAGATT TTGATATTGA AGGAAATGAA GCGAAGCYTA AATGAAAATT 180
CAGCTCGAAG TAOJGCAGGC TGTTTGCCTG TTCCGTTGTT CAATCAGAAA AAGAGGAACA 240
GACAGCCATT AACTTCTAAT CCACTTAAAG ATGATTCACG TATCAGTACC CCTTCTGACA 300
ATTATGATTT TCCTCCTCTA CCTACAGATT GGGCCTGGGA AGCTGTGAAT CCAGAGTTKG 360
CTCCTGTAAT GAAAACAGTG GAC^£CGCGC AAATACCACA TTCAGTTTCT CGTCCTCTGA 420
GAAGTCAAGA TTCTGTCTTT AACTCTATTC AATCAAATAC TGGAAGAAGC CAGGGTGGTT 480
GGAGCTACAG AGATGGTAAC AAAAATACCA GCTTGAAAAC TTGGRATAAA AATGATTTTA 540
AGCCTCAATG TAAACGAACA AACTTAGTGG C.^JKTG^.TOG AAAAAATTCT TGTCCAATGA 600
GTTCGGGAGC TCAACAACAA AAACAATTAA GAACACCTGA ACCTCCTAAC TTATCTCGCA 660
ACAAAGAAAC CGAGCTACTC AGACAAACAC ATTCATCAAA AATATCTGGC TGCACAATGA 720
GAGGGCTAGA CAAAAACAGT GCA.CTAO.GA CACTTAAGCC CAATTTTCAA CAAAATCAAT 780
ATAAGAMACA AATGTTGGAT GATATTCCAG AAGACAACAC CCTGAAGGAA ACCTCATTGT 840
ATCAGTTACA GTTTAAGGAA AAAGCTAGTT CTTTAAGAAT TATTTCTGCA GTTATTGAAA 900
GCATGAAGTA TTGGCGTGAA CATGCACAGA AAACTGTACT TCTTTTTGAA GTATTAGCTG 960
TTCTTGATTC AGCTGTTACA CCTGGCCCAT ATTATTCGAA GACTTTTCTT ATGAGGGATG 1020
GGAAAAATAC TCTGCCTTGT GTCTTTTATG AAATCGATCG TGAACTTCCG Aj3A.CTGA.TTA 1080
GAGGCCGAGT TCATAGATGT GTTGGCAACT ATGACCAGAA AAAGAACATT TTCCAATGTG 1140
TTTCTGTCAG ACCGGCGTCT GTTTCTGAGC AAAAAACTTT CCAGGCATTT GTCAAAATTG 1200
CAGATGTTGA GATGCAGTAT TATATTAATG TGATGAATGA AACTTAAGTA GTGATAAAAG 1260
GAAGTTTAGC ATAAATTATA GCAGTTTTCT GTTATTGCTT AATTTACCAT CTCCATAGTT 1320
TTATAGCTAC TATTGTATTT CACTTGTTGA ATTAAAGTAT TTGAATTCTT TTAAAAAAAA 1380
1382



60 
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298 



(2) J2IFOKMATICN FOR SEQ ID MO: 33: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 372 base pairs 
(3) TYPE: nucleic acid 

(C) STEANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

GGGCTACTTC AAAGCCCTGG GCCTTATTTC TTCAGGTAAA AAAATATAAA GTCAGATCTC 60

ATCCCGGCTG GCCATGCTGT TAjGACCCTTT CATCCTTCTC TTCTGCCTCT TCTCAAOACC 120

TGCCCAGTCC TGTTTGGAAT TCATATACAT ACAGTTCTAA TACTGATGTA TTTACCCTCA 180

TAAGCCACTC AA2CCAGAAT CTTATTTGAA TTATAATCCA GAAACATCAG GTGACGTGTG 240 

AGACTACTGT ATGAGAAAGA GACAGTTTAA GGGTCAGTCC AATGGAAAAA AGAGTTCTCA 300 


GAGCTTTCTT TAGCTTATTC TCATCAAAGA GCTTTCTCTG CAGAAGGAAC CTACTGGTTC 360 

CTCCTTTCCA GTCCTAGAAA TCCTGACCTA GAGTGGCTTA ATCCTGCTAG CACCTCTCTC 420

TCGCACTCTG GTGCCAAATG ACTCCAGGAA CTGGGCCATG ATGTGGTGGG AATGACCTTA 480

CCCTGAGCAT GTCACTCATG CATTGAACAA CAjGCTAAGAG CAGAGCTTAG AGCTTAGAGC 540

TGGGCCCTGT AAGGTGAGAG GAATCACATC CTGCAGAAGT CTGTCCTGAG AAGCAGGTAC 600 

TCCTGTCACA GCAGAGACAC AGTGGATACC TGAGTAACAA TAATACAAGA CAGGACGTGG 660

GMACAGCAAA AGATTTGGGT GTCAGAAGAR GCCGAGAACA CTT" f CAGGCA GGAACATTCA 720
HARTTGTTCT TGGAGGAART AGGCMC 3 AAG GCTGGGCAGG ATTTCMCG3G GCAGAGATGG 780

AGGAAGCAAT TGAAATGAAA GCCATGGCAT GGGAAAAGGA GCACTGGCCA CAGGGAGTGC 840

3A TGCAAGGCCA CTGTGGAGCC AT 872






(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 812 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

GGCAGAGGCT CACCCCAGCA GAGATTGAGG GGGAACCGTG ATGAAATTTT TAAGTATTCT 60

GCTTGATGAT AATAATTTTY CTCTTATGTT AATGTTGGCT CCGTTTGGGT GTTTAGCTTT 120

TGAAAGGAGT ATGAAAATGC GGAATGGGGC TTTGGGGCTT GAGGAGGTGT GATCTCTAGT 180

GTTTAAAAAA TTTAATTGCA CAAATAGAAA TAATTCACCC ACATTATTGA ACCCCACTAA 240
AGCATATCCT TTTTGTCCAT ATTCCTTTCC TGCTGCCCTC GTGTGTACCA TTATTACTCA
GTTGTGATTT GAGCTCGTTC O-CTTAAAGT CATTCATAGA TACTTTTGCG TCGTGTTKGA 
ATA-TTTATTC- aATTTCTATT CTGTGTTTTA GTTAATTACT TTA.TTATGGA AC CTTTACAC 
AGGTCTGGTG TACXTGTTCT TTGAAAAGTC TTATGTTGAC CACCATCACT GAGCATATAG 
CTTTTTCCTT ATTTCCTTGG GATAATTACC CGAAGTGGAA ATACCGAATC AAACTTCTGT 
TTTCT T TCTT TGGCACTATT ATATAAATTG TTTTCCAAAC AAGGCATGTT TACAATAGAC 
ATTTTTCAAA ATCTGGGTAT TTGTCCTATT TTGCTCTCTG TATGCAGAAT TCAGCGGGGT 
GCCAAGTCGT TTTCTGTGTG GGTTGAGAGA CAGGCTGTGC AGCCCACTGT TGCATAGGAC 
TAACTACTAC AAATCA.TGCT GAGACCGAGC TATTTTTGCT GCTTAGARGC TTTGCAGCCT 
TGAGTAAGTT TCGNCATCTG GAAAOJTTC-N AA 

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1515 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

AATTCGGCAC GAGGGAAATT CAAGCACTTT TCCTAAAAGA AGGGGGAA'TG GATGC TGAAA 

CAACACGTfcIT CCCACAAAGG GAGCAGACAC TGGGCTTGTG AAGCTGCCCC ATACCTTCCC 

CACAGAACTG GGGTCCGGCC TCCCTGACAT GCAGATTTCC ACCCAGAAGA CAGAGAAGGA 

GCCAGTGGTG ATGGAATGGG CTGGGGTCAA AGACTGGGTG CCTGGGAGCT GAGGCAGCCA 

CCGTTTCAGC CTGGCCAGCC CTCTGGACCC CGAGGTTGGA CCCTACTGTG ACAO^CCTAC 

CATGCGGACA CTCTTCAACC TCCTCTGGCT TCCCCTGGCC TGCAGCCCTG TTCACACT--.C 

CCTGTCAAAG TCAGATGCCA AAAAAGCCGC CTCAAAGACG CTGCTGGAGA AGAGTOVGTT 

TTCAGATAAG CCGGTGCAAG ACCGGGGTTT GGTGGTGACG GACCTCAAAG CTGAGAGTGT 

GGTTCTTGAG CATCGCAG-CT ACTGCTCGGC AAAGGCCCGG GACAGACACT TTGCTGGGGA 

TGTACTGGGC TATGTCACTC O.TGGAACAG CCATGGCTAC GATGTCACCA AC-GTCTTTGG 

GAGCAAGTTC ACACAGATCT CACCCGTCTG GCTGCAGCTG AAGAGACGTG GCCGTGAGAT 

GTTTGAGGTC ACGGGCCTCC ACGACGTGGA CCAAGGGTGG ATGCGAGCTG TCAGGAAGCA 

TGCOiAGGGC CTGCAC^TAG TGCCTCGGCT CCTGTTTGAG GACTGGACTT ACGATGATTT 
CCGGAACGTC TTAGACAGTG AGGATGAGAT AGAGGAGCTG AGCAAGACCG TGGTCCAGGT 840

GGCAAAGAAC CAGCATTTCG ATGGCTTCGT GGTGGAGGTC TGGAACCAGC TGCTAAGCGA 900

GAAGGGCGTG ACCGACCAGC TGGGGATGTT CACGGACAAG GAGTTTGAGC AGCTGGCCCC 960

CGTGGTGGAT GGTTTCAGCC TCATGACCTA CGACTACTCT ACAGCGCATC AGCCTGGCCC 1020
TAATGCACCC CTGTCCTGGG TTCGAGCCTG CGTCCAGGTC CTGGACCCGA AGTCGAAGTG 1080

GCGAAGCAAA ATCCTCCTGG GGCTCAACTT CTATGGTATG GACTACGCGA CCTCCAAGGA 1140 

TGCCCGTGAG CCTGTTGTCG GGGCCAGGTA CATCCAGACA CTGAAGGACC ACAGGCCCCG 1200 

GATGGTGTGG GACAGCCAGG YCTCAGAGCA CTTCTTCGAG TACAAGAAGA GCCGCAGTGG 1260

GAGGCACGTC GTCTTCTACC CAACCCTGAA GTCCGTGCAG GTGCGGCTGG AGCTGGCCCG 1320 

GGAGCTGGGC GTTGGGGTCT CTATCTGGGA GCTGGGCCAG GGCCTGGACT ACTTCTACGA 1380

CCTGCTCTAG GTGGGCATTG CGGCCTCCGC GGTGGACGTG TTCTTTTCTA AGCCATGGAG 1440

TGAGTGAGCA GGTGTGAAAT ACAGGCCTTC ACTCCGTTAA AAAAAAAAAA AAAAAAAAAA 1500

AAAAAAAAAA AAAAA 1515



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 704 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

AAGATGGTGG CGCCCAGAGC TTCGCTCTAT GCTGCTCCCC TGAGAGAGGC C-TTTC CATCA 60

ACCAGTTTTG CAAGGAGTTC AATGAGAGGA CAAAGGACAT CAAGGAAGGC ATTCCTCTGC 120

CTACCAAGAT TTTAGTGAAG CCTGACAGGA CATTTGAAAT TAAGATTGGA CAGCCCACTG 180

TTTCCTACTT CCTGAAGGCA GCAGCTGGGA TTGAAAAGGG GGCCCGGCAA ACAGGGAAAG 240 
AGGTGGCAGG CCTGGTGACC TTGAAGCATG TGTATGAGAT TGCCCGCATC AAAGCTCAGG 300

ATGAGGCATT TGCOCTGCAG GATGTACCCC TGTCGTCTGT TGTCCGCTCC ATCATCGGGT 360

CTGCCCGTTC TCTGGGCATT CGCGTGGTGA AGGACCTCAG TTCAGAAGAG CTTGCAGCTT 420 

TCCAGAAGGA ACGAGCCATC TTCCTGGCTG CTCAGAAGGA GGCAGATTTG GCTGCCCAAG 480

AAGAAGCTGC CAAGAAGTGA COCTTGCCCC ACCAACTCCC AGATTTCAAA GGAGGTAGTT 540

GCAAAAGCTG TGCCCAAGGG GAGGAAGGAG GTCACACCAA TATGATGATG GTTTTCATGA 600

CTTTGAATGA TATATTTTTG TACATCT AGO TGTATCGAGG CATCAGGCCT GAATAAACAT 660
CCTTTCTTAA AAAAAAAAAA AAAAAAAAAA AAAA 704

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1094 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

C^CAGCTTTC TTACAAACCC ATCCTTCTGA AATGTTGCTT CAAATTCATC CTCTGCTCCC 60 

CAGTCCCACT ATTCCACACA TACTGTTACT GTTTCTTTAT CCTACTTTCT CAATTTTGGA 120 

ACATAGTTGC AGTTACTGCA TTGAATACCT GTGGGTTTGC CTGTTGTTCT GTCTGTCTCT 180

GTGGTTCTTG TAATANTGGA TCCCAGAGAT AAAATGGACA GTTGTMATGC ACAGTTAATT 240

CAGAAACTAG ACCTTACTTG CTGTGTGAAA TACCAACTAA ATTCTCAGTG AACTCAGCTG 300

AMCTTTATCT CCTTTTGTTT CCCCAATTTA TAATTTCAGT TCAGGCCCAG AAAGATGGAA 360

TCCCAGCTAA GAAATACAAG TTACACCCTG TACTAGCAGC CCATGTGTGC ATGTTCTTTA 420

AGTGCTCTTG CAGCTATGTG ATTTATATTG ATTTCCCTGT ATTATTATAA GCAAAGCAAA 480

TTTGAGGAAA AAAACCCATA ATACCACACC TCATTTTTTT CAAGTAATAG GGTCATAAGT 540

CTCArrCTi'C ATATAATATG TTGAGTATGC AGTATATTAT GTGTTAGGCT CTGGANAGGC 600

AGAGGTTAGA TCATGTWACA GATQATATCK GATTAGGCAG ATAAACAGTA TTTTAACCTT 660

TTCCTTATTA TATGTAACTT GCTTTCAGGT TTTTTAATGT TACTATTATG TCTTTAATAT 720

ATTATCTTTA TTTGTACTTT TGTATACAGA GTCATTTTCC TTTTTTAAAA AAAATTGTGT 780

CTTTAGGATG GATTCCAAAG ATGTGGAATC AGTAGGTTTA AGGAATATGG ATATTTTGGC 840

TGGCAAGGTG GCTCACACCT GTAATCCCAG CACTTTGGGA GGCTGAGGTG GGTGGATCAC 900

CTGAAGTCAG GAGTTCGAGA CCAGCCTGAC CAACATGGCG AAACCCTGTT TOTACTAAAG 960
ACACAOvWAA AATTPJGCCAG TGGTGGTGGC ATGTGCTTGT AGTCCCACTT AGCTACTCGA 1020

GAGGCTGAGG CAGGAGAATC GCTTGAACCC GGGAGGCAGA GGTTGCAGTG AGGCAAGATG 1080 

GCACCTCTAC ACTC 1094 


(2) INFORMATION FOR SEQ ID NO: 43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1821 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:



TGC-GTTAGGG CATCACGCTT CCCTTGGGTG GAACTACTGG ACAGACCCTT TTGAGATGTG 60 
CCTGT GG7GC 7G7GCAGATG 7G7GTAG7GC- TCTTAGCTCT TTGTTGAGCT TGTGTGTGTG 120

TTG7GTAG7G 7TAG77GTA7 C<^GAiATTG GGCGTGTGTT GGAGGGGTTG TTAGCTCTTT 180
GGTGAJGATTG 7A7T7GTATG 7GTTTGTATC ASCTGAATGT TGCTGGA^AT AAAACCTTGG 240
TTTGT^-AGG GTGVTTTTTG 7GG-G-AGTAA GTAGGGGAAA AGGTCTTTGA GCGTTCCTAG 300 



GcrcrrrrGT ac-agaggaa aatgcctca-- acogttgctt cccagcaact tgggggtggt 360

TCCCAGTGCC TGG77C7GCG CCTTC CTGGT TCTTATCTCA AGGCAGAGCT TCTGAATTTC 420

AGOG 2TTGAT TC GAGAGCCC TCTTGTGGCC AC-GGCTTCCT TTGGTGGAGG AAGGTAGAC-. 480

GGG7GAAG7T GA7GCTGTAC TTGGGGGATG TCCTTGGCCT GTTCCACCAA GTGAGAGAAG 540

GTACTTAC7G 7TG7ACCTGC TGTTGAGCCA GGTGCATTAA CAGACCTCGG TAGAGGTGTA 600 

GGAAG7ACTG TG 7 G-JGAGGT GAGGCAAGGG GATTTGTGAG GTCATTTGGA GAACAAGTGG 660 

TTTAGTAG7A GGTTAArJGTA GTAAC7GCTA CTGTATTTAG TGGGGTGGAA TTCAGAAGAA 720

ATTTGA-GAG GACA7GATGG GTGGTC7GGA TGTCAATGAA CAGGAATGAG CCGGACAGCC 780

TGG-GTGTCAT 7GG77TCTTC CTGCCCATTT GGACCGTTCT CTGCCCTTAC ATTTTTGTTT 840

CTGCATC7AG TACGATGCAC GAGTGTATTT ATTAAGTTAG CAAGAGGACA AGTAAhGGGG 900 



CCTGTTGG3T TGA7TTTGCT TCTTTGTTTC TGTGGAGGAT ATACTAAGTG CGACTTTGCC 960
CTA.7CGTA7T TGG.-A_i.TGCG TAACA'GAATT GAGTTTTCTA TTAAGGATGC AAPAAGAAAA 1020



ACAAAATG7T AA7GAAGCCA TCAGTGAAGG GTGACATGGC AATAAACAA.T AAATTTTCCA 1080 



GAAjGAAATGA. AA7CGAAGTA GACA^ATA^A GTAGAGCTTA TGAAATGGTT CAGTArGGAT 1140 

GAG7TTG7TG iTTTTTGTTT TGTTTTGTTT TGKTTTTTTA AAGAGGGAGT CTCGCTCTGT 1200

CACTCAGG7T GGA7TGCAGT GGTATGATGT TGGGTCACTG TAACCTCCGG GTCGCGGGTT 1260

O-AGCCATTC TCC7GGCTGA GTGTGGTGAG TAGGTGGGAT TACAGGTGCG TGCCACCATG 1320



CCTGGCTAAT TTT7GTGTTT TTAGTAGAGA CAGGGTTTGA CGATGTTGGT CGGGCTGGTC 1380



TGAAACT7CT GA7CTGTTGA TCCGCCTGCC TTGGGCTCCC AAAGTGATGG GATTACAGAT 1440 


GTGAGGCAGG C G7GGGGT AG OGA^GGATGA GATTTT T AAA • GTATGTTTCA GTTCTGTGTC 1500 

ATGGTTG7AA GACAGAGTAG GAAGGATATG GAAAAGGTCA. TGGGGA-jGCA GAGGTGATTC 1560 

ATGGCTG7GT GAA7TTGAG7 TGAA7GG7TC GTTATTGTCT AG7C 7ACTTG TGAAGAATAT 1620
GAGTCAGTTA TTGCCAGCCT TGGAATTTAC TTCTCTAGCT TAO^-TGGAC CTTTTGAACT 1680

C^AAACACC TTGTCTGCAT TCACTTTAAA ATGTCAAAAC TAATTTTTAT A-TAAA.TGTT 1740

TATTTTCACA TTGAAAAAAA AAAAAAATTT AA-AACYCC-G GGGGGGCCC3 Gi/ACCCCATT 1800

MGCCCCTAAG GGGGGGGGTT T 1821



(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1024 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GGGGCACAGT TGAAGArtGCG ACCGAGGGAC TGGGAGTCGT TAGTGAGGAT GACGCGGCAT 60

GGCAAGAACT GCACCGCA3G GCCGTCTACA CCTACCACGA GAAGAAGAAG GACACAGCGG 120

CCTCGGGCTA TGGGACCCAG AACATTCGAC TGAGCCGGGA TGCCGTGAAG GACTTCGACT 180

GC^GTTGTCT CTCCCTGCAG CCTTGCCACG ATCCTGTTGT CACCCCAGAT GGCTACCTGT 240

ATGAGGGTGA GGCCATCCTG GAGTACATTC TGCACCAGAA GAAGGAGATT GCCCGGCAGA 300 

TGAAGGCCTA CGAGAAGCAG CGGGGCACCC GGCGCGAGGA GCAGAAGGAG CTTCAGCGGG 360 

CGQCCTGQCA GGACCATGTG CGGGGCTTCC TGGA3AAGGA GTCGGCTATC GTGAGCCGGC 420

CCCTCAACCC TTTCACAGCC AAGGCCCTCT CGGGCACCAG CCCAGATGAT GTCCAACCTG 480

GGCCCAGTGT GGGTCCTCCA AGTAAGGACA AGGACAAAGT GCTGCCCAGC TTCTGGATCC 540

CGTCGCTGAC GCCCGAAGCC AAGGCCACCA AGCTGGAGAA GCCGTCCCGC ACGGTGACCT 600 

GCCCCATGTC AGGGAAGC CO CTGCGCATGT CGGACCTGAC GCCCGTGCAC TTCACACCGC 660

TAGACAGCTC CGTGGACCC-C GTGGGGCTCA TCACCCGCAG CGA3CGCTAC GTGTGTGCCG 720

TGAGCCGGGA CAGCCTGAGC AACGCCACCC CCTGOGCTGT GCTGCGGCCC TCTGGGGCTG 780
TGGTCACCCT CGAATGCGTG GAGAAGCTGA TTCGGAAGGA CATGGTGGAC CCTGTGACTG 840

GAGAGAAACT CACAGACCGC GACATCATCG TGCTGCAGCG GGGCGGTACC G3TTCGCGGG 900

CTCCGGAGTG AAGCTGCAAG CGGAGAAATC ACGGCCGGTG ATGCAGGCCT GAGTGTGTGC 960

GGGAGACCAA ATAAACCGGC TTGGGTGCGG AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1020

AAAA 1024
(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 983 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

CGACACGGCT GGGAGAAGAC GACAGAAGGG CCCGACCGCG AGCCGTCCAG GTCTCAGTGC 60 

TGTGCCCCCC CCAGAGCCTA GAGGATGTTT CATGGGATCC CAGCCACGCC GGGCATAGGA 120

GCCCCTGGGA ACAAGCCGGA GCTGTATGAG GAAGTGAAGT TGTACAAGAA CGCCCGGGAG 180

AGGGAGAAGT ACGACAACAT GGCAGAGCTG TTTGCGGTGG TGAAGACAAT GCAAGCCCTG 240

GAGAAGGCCT ACATCAAGGA CTGTGTCTCC CCCAGCGAGT ACACTGCAGC CTGCTCCCGG 300

CTCCTGGTCC AATACAAAGC TGCCTTCAGG CAGGTCCAGG GCTCAGAAAT CAGCTCTATT 360

GACGAATTCT GCCGCAAGTT CCGCCTGGAC TGCCCGCTGG CCATGGAGCG GATCAAGGAG 420

GACCGGCCCA TCACCATCAA GGACGACAAG GGCAACCTCA ACCGCTGCAT CGCAGACGTG 480

GTCTCGCTCT TCATCACGGT CATGGACAAG CTGCGCCTGG AGATCCGCGC CATGGATGAG 540

ATCCAGCCCG ACCTGCGAGA GCTGATGGAG ACCATGCACC GCATGAGCCA CCTCCCACCC 600

GACTTTGAGG GCCGCCAGAC GGTCAGCCAG TGGCTGCAGA CCCTGAGGGG CATGTCGGCG 660

TGAGATGAGC TGGACGACTC ACAGGTGCGT CAGATGCTGT TCGACCTGGA GTCAGCCTAC 720

AACGCCTTCA ACCGCTTCCT GCATGCCTGA GCCCGGGGCA CTAGCGCTTG CACAGAAGGG 780

CAGAGTCTGA GGCGATGGCT CCTG3TCCCC TGTCCGCCAC ACAGGCCGTG GTCATCCACA 840

CAACTCACTG TCTGCAGCTG CCTGTCTGGT GTCTGTCTTT GGTGTCAGAA CTTTTGGGCC 900

GGGCCCCTCC CCACAATAAA GATGCTCTCC GACCTTCAAA AAAAAAAAAA AAAAAAAAGR 960 

KGSGGCCGGT CCCCANTCCC CGC 983 




(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2421 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

CCGGCTGATC GCTGCCGCTC CGCCAATACA ATAGAGCCAK CCACTACCAG CAGCCTGGCC 60
CTCTTCCTCC TTCTCCAGAG AGACCAATCC AGCCGAACTC G3C-3TTTGCC TGAGGAGAAG 120

GAGGAAGTGA CCATGGAOjC AAGTGAAAAC AGA.CCTG.-AA ATGATGTTCC AGAACCTCCC 180

ATGCCTATTG CAGACCAAGT CAGCAATGAT G-^CCGCCOGG AGGGCAGTGT TGAAGATGAG 240

GAGAAGAAAG AGAGCTCGCT GCCCAAATCA TTCAAGAGGA AGATCTCCGT TGTCTCAGCT 300

ACCAAGGGGG TGCCAGCTGG AAACAGTGAC ACAGAGGGGG GCCAGCCTGG TCGGAAAGGA 360

CGCTGGGGAG CCAGCACAGC C\CCACACAG AAGAAACCTT CCATCAGTAT CACCACTGAA 420

TCACTAAAGA GCCTCATCCC CGACATCAAA CCCCTGGCGG GGCAGGAGGC TGTTG7GGAT 480

CTTCATGCTG ATGACTCTCG CVTCTCTGAG GATGAGACAG AGCGTAATGG CGATGATGGG 540

ACCCATGACA AGGGGCTGAA AATATGCCGG ACAGTCACTC AGGTAGTACC TGCAGAGGGC 600

CAGGAGAATG GGCAGAGGGA AGAAGAGGAA GAAGAGAAGG AACCTGAAGC AGAACCTCCT 660

GTACCTCCCC AGGTGTCAGT AGAGGTGGCC TTGCCCCCAC CTGCAGAGCA TGAAGTAAAG 720

AAAGTGACTT TAGGAGATAC CTTAACTCGA OGTTCCATTA GCCAGCAGAA GTCCGGAGTT 780

TCCATTACCA TTGATGACCC AGTCGGAACT GCCCAGGTGC CCTCCCCACC CCGGGGCAAG 840

ATTAGCAACA TTGTCCATAT CTCCAATTTG GTCCGTCCTT TCACTTTAGG CCAGCTAAAG 900

GAGTTGTTGG GGCGCACAGG AACCTTGGTG GAAGAGGCCT TCTGGATTGA CAAGATCAAA 960

TGTCATTGCT TTGTAACGTA CTCAACAGTA GAGGAAGCTG TTGCCACCCG CACAGCTCTG 1020

CACGGGGTCA AATGGCCCCA GTCCAATCCC AAATTCCTTT GTGCTGACTA TGCCGAGCAA 1080

GATGAGCTGG ATTATCACCG AGGCCTCTTG GTGGACCGTC CCTCTGAAAC TAAGACAGAG 1140

GAGCAGGGAA TACCACGGCC CCTGCACCCC CCACCCCCAC CCCCGGTCCA GCCACCACAG 1200

C^C^C^C^-GG CAGAGCAGCG GGAGCAGGAA CGGGCAGTGC GGGAACAGTG GGCAGAACGG 1260

GAACGGGAAA TGGAGCGGCG GGAGCGGACT CGATCAGAGC GTGAATGGGA TCGGGACAAA 1320

GTTCGAGAAG GGCCCCGTTC CCGATCAAGG TCCCGTHACC GCCGCCGCAA GGAAGGTGCG 1380

AAGTCTAAAG AAAAGAAGAG TGAGAAGAAA GAGAAAGCCC AGGAGGAACC ACCTGCCAAG 1440

CTGCTGGATG ACCTTTTCCG AAAGACCAAG GCAGCTCCCT GCATCTATTG GCTCCCACTG 1500

ACTGACAGCC AGATCGTTCA GAAAJGAGGCA GAGCGGGCCG AACGGGCCAA GGAGCGGGAG 1560

AAGCGGCGAA AGGAGCAAGA AGAAGAAGAG C^AAAGGAGC GGGAGAAGGA AGCCGAGCGG 1620

GAACGGAACC GAGAGCTGGA GCGAGAGAAA CGTCGGGAGC ^iCAGTCGGGA GZGC^ACZGG 1680

GAGAGAGAGA G.-G AAAGGGA GCGGGACAGG GGGGAGOGAG ATCGGGATAG GGAAAGGGAC 1740

CGAGAACGAG GCAGGGAAAG GGATCGCAGG GACi.C-CA.2jGC GCCACAGCAG AAGCCGGAGT 1800

CGG^GCACAC C '^GTGCGGG A CCGGGGTGGG CGCCGCTAGC TGGGAAP-ACA CTAGAGCTGC 1860
AGGTACCAGC CACTCGGCCC CAGGGGGTTA TGGCCACAGA GGGATAGGCA CAGTCTCC-jC 1920

CACCCTGGAG CCAAGGGTCT TTCACATCAC CTATCCCTAC ATACATACCA AATGGAAAAG 1980

TGGCCATCCT TTTCCCCCCA AACACACCCC CTTAACCTAT CTCTTGOGAC TTAGCCCGAC 2040

CCTCCCTCTC ATTTCCCATT AAGTCTGAGA GGCAAC-AGCT AGGTTAGGCA AGGAGGTC-GT 2100

TGGCCAGAGA TGGGGAACAG CCAGGTGCCC CAGTCCTCTG ATTTTTCCTC CATCCTGCT? 2160

ACCACCTCCC TGGGTACTTA CAGCCTTCTC TTGGGAACAG CCGGGGCCAG GACTGGGTCA 2220
CCTATGAGCT GAATCAGCAT CTCCTCCTGA GTCCCAGGGC CCCTGCAGTT CCCAGTCTCT 2280

TCTGTCCTGC AGCCCTTGCC TCTTTCCCAC AGGTTCCACT TTATATCCAC CTTTTCCTTT 2340

TGTTCAATTT TTATTTTTAT TTTTTTTATT ATTAAATGAT CTGGTCTATC- GAAAAAAAAA 2400

TAAAAATCTG ACTTAGTTTT A 2421 






(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 840 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

CTCAAACTCC TGAGCTGAAG CGATCTACCT GCCTCAGCTA GGATTACAGG TGTGAGCCAC 60

CGCACCCAAC CTCAATAAGC KTATTTGATA AAA2CATATGC AAGCTCCCTT TATKCACTTT 120

TCATTCAGAA TGTTTAGTAA TTTGTATTGT TTTTCAGATT TTCAGCCCAA TATATCTCCY 180

TGCCCACTGT GTCACTGTAT TCTACCTAWA CATCATCACG TGTTTCTGCT ATTGGCTGTA 240

TGATGGAACA CTGCGGCTCA TTTTCCTGAA AACTGCCGAT AGTGCATAGA RTGCTGGGAI 300

GGAAACCAGA ARCTTTGAAT TCAAGCCTTG GTTCTGCCTT GTTTTTGCTT GGGTGGCCTT 360

GAGTCAGCCA CATACCTTTT AAAAICTCAA TTTATTAGAA ATTATTCCAA ATCAAAATCA 420
AATGAGAAGG TATATACAAA AGTGCTTTAT CCCACAATAA ACTATTCAAG AGAGA.GC.A-A 480

GGAGAGGACA TTTACTCAAC ACCTCCTAAA ACGK2A.GCCAC- TGAAATTAGG CATTTTATTT 540

AATCCTCCTG GCAACTCTGA GAGTAAAGCA TTATTAATCC CATTTTGGCT GTTTAAAG.-A 600

ATTATTTGCA CTAGATTCCA GCTGTAGTTT AGYTTCAGAA AAAAAAATCC TGAGATGTGA 660

ATTCACAGCT TTCTGGGTTT AAAGCCCAAG CTCTATCACA TCATGCTATT ATTGTTACAT 720

TAGTGCTAGT TCTATGAAAA GAAATACTAA TTTATGAAAT ACATCTTATC CAAAAAAAAA 780

AJ^AAAAAAAC TGGGAGGGGG GGCCCGTACC CAAATCGCCG GATAGTGATC C-TAAACAACC 840



(2) INFORMATION FOR SEQ ID NO: 48:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2432 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

GGCACGAGGC CCGGAACGCT GAGGAAGGGC CGGTCCOGCC TTCCCCGGCG CGCCATGGAG 60

CCCCGGGCGG TTGCAGAAGC CGTGGAGACG GGTGAGGAGG ATGTGATTAT GGAAGCTCTG 120

CGGTCATACA ACCAGGAGCA CTCCCAGAGC TTCACGTTTG ATGATGCCCA ACAGGAGGAC 180

CGGAAGAGAC TGGCGGASTG CTGGTCTCCG TCCTGGAACA GGGCTTGCCA CCCTCCCACC 240 

GTGTCATCtG GCTGCAGAGT GTCCGAATCC TGTCCCGGGA CCGC*ACTGC CTGGACCCGT 300 

TCACCAGCCG CC3GZGCCTG CAGGCAYTAG CCTGYTATGY TGACATCTCT GTCTCTGAGG 360

GGTCCGTCCC AGAGTCCGCA GACATGGATG TTGTACTGGA GTCCCTCAAG TGCCTGTGCA 420

ACCTCGTGCT CAGCAGCCCT GTGGCACAGA TGCTGGCAGC AGAGGCCCGC CTAGTGGTGA 480

AGCTCACAGA GCGTGTGGGG CTGTACCGTG AGAGGAGCTT GCCCCACGAT GTCCAGTTCT 540

TTGACTTGCG- GCTCCTCTTC CTGCTAACGG CACTCCGCAC CGATGTGCGC CAMAGCTGTT 600
TCAGGAGCTG AAAGGAGTGC GCCTGCTAAC TGACACACTG GAGCTGACGC TGGGGGTGAC 660

TGCTGAAGGG AACCCCCCAC CCACGCTCCT TCCTTCCCAA GAGACTGAGC GGGGCATGGA 720
GATCCTCAAA GTGGTCTTCA ACATCACCCT GGACTCCATC Jk£.GGGC<L\GG TGGACGAGGA 780

AGACGCTGGC CTTTACCGAC ACCTGGGGAC CCTTCTCCGG CACTGTGTGA TGATCGCTAC 840

TGCTGGAGAC CGCACAGAGG AGTTCCACGG CGACGCAGTA ASGCTCCTGG GGAACTTGCC 900
CCTCAAGTGT CTGGATGTTG TCCTCACCCT GGAGCCACAT GGAGAGTCCA CGGAGTTCAT 960

GGGAGTGAAT ATGGATGTGA TTCGTGCCCT CCTCATCTTC CTAGAGAAGC GTTTGCACAA 1020

GACACACAGG CTGAAGGAGA GTGTAGCTCC CGTGCTGAGC GTGCTGACTG ?u\TGTQCCQG 1080

GATGCACCGC OCAGCCAGGA AGTTCCTGAA GGGCCAGGTG CTGCCCCCTC TGCGGGATGT 1140 

GAGGACACGG OTTGAGGTTG GGGAGATGGT C03CAACAAG CTTGTCCGCC TCATGACACA 1200 

CCTGGACACA GATGTGAAGA GGGTGGCTGC CGAGTTCTTG TTTGTCGTGT GGTCTGAGAG 1260

TGTGCCCCGA TTCATCAAGT ACACAGGCTA TGGGAATGCT GCTGGCCTTC TGGCTGCCAG 1320

GGGCCTCATG GCAGGAGGCG GCCCGAGGCC AGTACTCAGA GGATGAGGAC ACAGAGAGAG 1380
ATGAGTACAA GGAAGCCAAA GCC^GCATA-A ACCCTGTGAC CGGGAGGGTG GAGGAGAAGC 1440

CGCCTAACCC TATGGPJGGGC A2GAj2AGAGG AGCAGAAGGA GCACGAGGCC ATGAAGCTGG 1500 

TGACCATGTT TGACAAGCTC TCCAGGAACA GAGTCATCCA GCCAATGGGG ATGAGTCCCC 1560

GGGGTCATCT TACGTCCCTG CAC-GATGCCA TGTGCGAGAC TATGGAGCAG CAC-GTCTCCT 1620

CGGACCCTGA CTCGGACCCT GACTGAGGAT GGCAGCTCTT CTGCTCCCCC ATCAGGACTG 1680

GTGCTGCTTC CAGAGACTTC CTTGGGGTTG CAACCTGGGG AAjGCCACATC CCACTGGATC 1740

CACACCCGCC CCCACTTCTC CATCTTAGAA ACCCCTTCTC TTG ACTCC CO TTCTGTTCAT 1800

GATTTGCCTC TGGTCCAGTT TCTCATCTCT GGACTGCAAC GGTCTTCTTG TGCTAGAACT 1860

CAGGCTCAGC CTCGAATTCC ACAGACGAAG TACTTTCTTT TGTCTGCGCC AAGAGGAATG 1920

TGTTCAGAAG CTGCTGCCTG AGGGCAGGGC CTACCTGGGC ACACAGAAGA GCATATGGGA 1980 

GGGCAGGGGT TTGGGTCTGG GTGCACACAA AGCAAGCACC ATCTGGGATT GGCACACTGG 2040

CAGAGCZ4AWT GTKTTGGGGT ATGTGCTGCA CTTCCCAGGG AGAAAACCTG TCAGAACTTT 2100

GCATACGAGT ATATGAGAAC ACACCCTTCC AAGGTATGTA TGCTCTGTTG TTCCTGTCCT 2160
GTCTTCACTG AGGGCAGGGC TGGAGGCCTC TTAGACATTC TCCTTGGTCC TCGTTCAGCT 2220

GCCCACTGTA GTATCCACAG TGCCCGAGTT CTCGCTGGTT TTGGCAATTA AACCTCCTTC 2280

CTACTGGTTT AGACTACACT TACAACAAGG AAAATGCCCC TCGTGTGACC ATAGATTGAG 2340

ATTTATACCA CATACCACAC ATAGCCACAG AAACATCATC TTGAAATAAA GAAGAGTTTT 2400

GGACAAAAAA AAAAAAAAAA AAAAAAAAAA AA 2432

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1742 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:



GTCCTGCAGG AGCTGCACGC GGCCGAGGTG CGCANGAACA AGGAC-CAGCG AJ3AAGAGATG 60

TCGGGCTAAG GGCCCGG3AC GPGSGGCGCC CATCCTGCGA CGGAACACGT TCGGGTTTTG 120

GTTTTGTTTC GTTCACCTCT GTCTAGATGC AACTTTTGTT CCTCCTCCCC CACCCCAGCC 180

CCCAGCTTCA TGCTTCTCTT CCGCACTCAG CCGCCCTGCC CTGTCCTCGT GGTGAGTCGC 240 

TGACCACGGC TTCCCCTGCA GGAGCCGCCG GGCGTGKAGA CGCGGTCCCT CGGTGCAGAC 300 

ACCAGGCCGG GCGCGGCTGG GTCCCCCGGG GGC CCTGTG A GAGAGGTGC i GGTGACCGTG 360
GTAAACCCAG GGGGGTGGCG TGGGATCRCG GGTCCTTACG CTGGGCTGTC TC<3TCAGCAC 420

GTGCAGGTCA GGGCAGGTCC TCTGAGCCGG CGGCCGTGGC C^^CAGGCGA GGCTACAGTA 480

CCTGCTGTCT TTC-CAGGGGG AAGC-GGCTCC CCATGAGGRA GGGGCGACGG GGGAGGGGGG 540 

TGATGGTGCC TGGGAAGCCT GCKTGTGCAN CCGGTGCTTG TTGAACTGGC AGGGGGGTGG 600 

GTGGGGCCTG CAGCTTTCCT TAATGTGGTT GCACAGGGGT CCTCTRAGAC CACCTGGCGT 660

GAGGTGGACA CCCIGGGCCT TCCTGGAAGC CTGCAGTTGG GSGCCTGCCC TGAGTCTGCT 720

GGGGAGTGGG CATTCTCTGC CAGGGACCCA TGAGCAGGCT GCATGGTCTA GAGGTTGTGG 780

GCAGCATGGA CAGTCCCCCA CTCAGAAGTG CAAGAGTTCC AAAGAGCCTC TGGCCCAGGC 840 

CCCTCCGTGG GACAGCCCCG CCGCCCCTCC CCACCAGGGC TTTGCAGATG TCCTTGAAAG 900 

ACCCACCCTA GAGCCCTTTG GAGTGCTGGC CCCTCCTGTG CCCTCTGCCC TGGTGGAAGC 960

GGCA3CACAA GTCCTCCTCA GGGAGCQCCA AC-GGGGATTT TKTGGGACCG CTGCCCACAG 1020 

ATCCAGGTGT TGGAAGGGCA GCGGCTAAGG TTCCCAAGCC AGCCCCAACA CCCTTCCCAC 1080 

TTGGCACCCA GAGGGCGCTG TGGGTGGAGG CCTGACTOCA GGCCTCTCCT GCCCACACCC 1140 

TCTGGGCTGA GTTCCTTCTT TCCCTTGGAC GCCCAGTGCT GGCCTTGGAG GACGGTCAGC 1200

TGGAGGATGG CGGTGGGGGA GGCTGTCTTT GTACCACTGC AGCATCCGCC ACTTCTCGAC 1260

GGAAGCCCCA TCCCAAAGCT GCTGCCIGGC CCCTTGCTGT AAAGTGTGAA GGGGGCGGCT 1320

GAGTTCTCTT AGGACCCAGA GCC\CGOCCC TCAACTTCCA TCCTGCGGGA GGCCTVGGCC 1380

GGGCACTGCC AGTGTCTTCC AGAGCCACAC CCAGGGACCA CGGGAGGATC CTGACCCCTG 1440 

CAGGGCTCAG GGGTCAGGAG GGACCCACTG CCCCATCTCC CTCTCCCCAC "OAGACAGCC 1500
CCAGAAGGAG C^GCCAGCTG GGATGGGAAC CCAAGGCTGT CCACATCTGG CTTTTGTGGG 1560

ACTCAGAAAG GGAAGCAGAA CTGAGGGCTG GGATATTCCT CATGGTGGCA GCGCTCATAG 1620

CGAAAGGCTA CTGTAATATG CACCCATCTC ATCCACGTAG TAAAGTGAAC TTAAAAATTC 1680

AATCAAATGA ACAATTAAAT AAACACCTGT GTGTTTAAGA AAAAAAAAAA AAAAAAACTG 1740

CG 1742

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1487 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:



GGCACGAGCC TCCGCCAACT GTGGAjGTCGG CGGAGGGCTG GAATCAGCGT GGGCTCCAGG 60

TCGCTGGCAG CCGGGTGGCA GAACTCTTCC GAGGCTCCTT GGGAAGAAC-C TACACCCGAG 120

GGAGCCGGAT GGGCCTCGAA AACCTGGCCC GCTCTGGTTC TGTACCATTG CAAGGGGAAC 180

CGTAAACTGA GCTTTTCTAA CGTGGGTTTC TGCC-AGTAC TTTTCCAGCT GCCCCCTTCC 240

CCCCAGCACA CAGGAGAGCC TCTGTGTAGC CAGCGCTTGA OjGTCGTTAG GTAGGTTGTA 300

CTGTGTAGGG AGGAGCTCAA GATCATGAAT GGTTGTCACA GGAGAAAGCG GTTGCATCTT 360

TGCAAAACTA TATACCTGCT GTGGTTTGTG TTTTCTTTTC TGCTGAGTAA TGAAGTTGTA 420



AGTTCACACT GGCACATTCT CAGGGCTGTG CAGATTATTT GCACTTTATT TCATAGGTGE 480 



ATAAGTGCTT TTTAGCTTTC TTTGTATATT GAGTTGCTTT TGAATTGGTT CCCATATTTT 540 

TATTTCATAC AAACTGAACA ATTGTGGCCC CTCTATTTTA TTTATAAAGG TTCAGTGTAT 600

CTTTGCCTGC CTACATCAAT CTGCAAGGGA GTTGCAGAAA GCCTCATGTT CATCGAGCCG 660

TGAGTCACAA CCAATTTCTA AGCTGTTATA ACAAAAAAGT GTTTGCTTTT TTTCACAAGT 720
AACTTTAAAA GTGTAGTTTA GAAAGAAAAC ATTTTCAATA AAAAGACACT ACATTAATCC 780

TGGATGCTTG CAAATCCTAA AAXMTATTCC TCCTCTAGGG TTGCACAGCT CTGTGTTGTA 840

TACACAGACT AGCTTTAAAA TTTGTCACAT AGCACTTTAC CTTTACTTTT ATGTATCATT 900

CCCCCGACTT CCTTACTGCA GGTGTGGGCA AGAAAACTTT TCCTTTAACA CTTTTCAACA 960

CCGGGCATAA AATTCTGCAG CTGAGGTCTT GAAGAATGCA GATGGGTACA GTATGTGTTG 1020

GAGCTCACAG TGTGTATTGA CTAACCTAGT TCCTTTTTTG CTTTTTTTGG TATTGTCTTG 1080



TTAAAAGTGA CTCCCAGGTA GCAACTCTCT TTTTTAAGGG TGGGAACGAA AGGGACGTAG 1140

GAAGAATAGA TCTAGATTAT TTAAGAGTCT TCGATAGAGT TTGAAAGCTT TCTTCTTCAT 1200

TCAATTTTGG GCAAAATACT GCCTCTGCAT TTGTTCATAA C-AAAAGATT AGATTAATAA 1260

GTAjGCTTTTG TTGGTGGAAA TTACCAGCTC TATAAGTCAC CCTTGGTGGT TCATGGACCT 1320

CTGATTAjGCT TGGGTTTTGC AGTCTCATTG CCACATGTAT ATGTGGAGCC AATGGCCTTT 1380

TGGTGGTCAG CTGTTTACGT CTGACTCCTT GACTTCTTTG GTACAGTGAT GGAGTCAGAT 1440

CTCATTAAGT GTGATTCTCC ATGGATATAA CCAGGCCCAA AAAAA2TG 1487

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1328 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

GGCACGAGCT CGTGCCGAAT TCGCCACCAG AGAAGATTTG &AGAAGCC3G ATCCAGCTTC 60

GCTGGGGGCT GCTTCTTGTG GGGAAGGGAA AAAGAGGAAG GCCTGTAAGA ACTGCACCTG 120

TGGCCTTGCC GAAGAACTGG AAAAAGAGAA GTCAAGGGAA CAGATGAGCT CCCAACCCAA 180



GTCAGCTTGT GGAAACTGCT ACCTGGGCGA TGCCTTCCGC TGTGCCAGGT GCCCCTACCT 240 

TGGGATGCCA GCCTTCAAAC CTGGGGAAAA GGTGCTTCTG AGTGATAGCA ATCTTCATGA 300 

TGCCTAGGAG GTTGCTGACA TGGGACCCAT CTGCTCCTCC AGCCAACTCC TGTCCCTCAC 360

ATCCCACCA? GGTGGCTCCT CCCACCTCCT CTGGATTTGT TCACTCTGAG ATCTGTTTGC 420

AGAGTGGGTG CTTAGCAGAC AGAGTGAAGC TGGCTGGGGG CGACAGTGGT GTGTAGTGCT 480



GCTGTGTATC AAAAGACCAA GGTATTATGG GACCTGGTTT CAGAATGGGA TGGGTTTCTT 540 

CACCTCATGT TAAGAGAAGG GAGTGTGTCC TGAAGAAGCG CTTCTTCTGA TGTTAAAATG 600 

CTGACCAGAA CGCTCTTGAG CCCAGGCATC GTTGAGCATT AACACTGTGT GACAGAGGTG 660

CAGACCGCTG CCTTGAGTCT CATCTCAGCA ATGCTGCCAG CCTCTTGTCT TTGAGAGTTG 720

TTAGTTTACT CCATTCTTTG TGACACGAG? CAAGTGGCTG ACAACCTCCT CAGGGCAGCA 780

GAGGAGTCAC TCACTGGTTG CTGTGATGAT ATGCAGTGTC CCTCTGCCCC CTTCCATCCC 840



CAACCACATT TGACTGTAGC ATTGCATCTG TGTCCTGTTG TCATTTATGT TAACCTTCAG 900 

GTATTAAACT TGCTGCATAT CTTGACATAT CTTGAGATTC TGCATGTGTT GTAAAGAGAG 960

GGGATGTGCA TTTGTGTGTG ATGTTGGATA GTCATCCACG CTCAGTTTGG ACGATTGGAG 1020
GAACTTAGTG TCACGCACAA ATGGGGCTAT TCCTACGCTT AGAATAGGGC TTGTGTGGCC 1080



ACTTTAGAAG AGTCCCAGGT TGGTGAGCAT TTAGAGGGAA GCAGGGCAGA ACTGTGAACG 1140

ACAATACGTC TCTGTGAGCA GAGACCCCTT TGTTCTTGTT ATCCACCCAT ATGGACTTGG 1200

AATCAATCTT GCCAAATATT TGGAGAGATT GTGTGGATTT AAGAGACCTG GATTTTTATA 1260

TTTTACCAGT AAATAAAAGT TTTCATTGAT ATCTGTCCTT GAAAAAJ^A-AA AAAAAAAAAA 1320

AAACTCGA 1328



(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1856 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double



WO 98/54963 



PCT/US93/H422 



311 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

GAATTCGGCA CGAjGCTCTGC AAC.-JZTSCAA. AGG.--AGTrGG AGtCGAGGGG TCCC3CTGCCC 60

ccgacattaa attccccggg cTG.-----.rrG.-- gtgg-gag-gg tagaata.tca tattggaaag 120

TGGTGTCTTG AATGAAACCA TTTAGGAZ GA TAAGGAAGTT 7GAGGATGGG GATGCATGGT 180

. tttccl-jggcc ttcgtggttt gtagaa.-agg a.-aggggg-g aaag7gtttc acttatatic 240 

TTGAAACATG ATC-GTAATTT AAATTAAGGA CTTCGTAGG-. 7AGGGGATGA. TTCCTATGAT 300 

TTTGCGACTG TTAGTAGTTC TCTCAAAAAG AG\GG7A£GG .-AGAGGATGA TTTTAAGT?A 360

TTTGATTATC TTTCGATGTG TTGTAGGGAG TTGTGAGTGA GTGAAGAAAG TGGTTCGATT 420

GGTTGGCATT' GATACAGTAA ATGGGTAAAG GAGGAGACA-. GAGAAAAAAT CTAAATTACT 480

TGTCCTTAAT GACTGTAGCA GAATSC TG7CG AAAG Z AG.-GGGTGGG TCTTGCAGTT 540

TAGTTTGATA GATGTGGAAG CTA.TGG7GGT TC G-GGAA7T 7AGG7G-GGG7 GGTAGGAACG 600

CAGGCTTCTT TGTCTCGGGT TGGAGGGGGG ATGATGGC Z Z GATGAGGCAG ACAACGTAGC 660



CGGAGATCAC AAAGGACGCC CTGGGGGGAG TTGCTAGGG7 G7GG-GGTGG AGAGAGGTGG 720

GCAGAAACTG ACC7CAGTGG GGAAGGGGGG CGATG-GAGGT G-GGC7TTAA TGCACTCTAT 780

GTGTTGAGGA AGO GACAGGC CATATTGGAG TG7GAGAA-G AAAAGAAGAG GAAAAACCCC 840

ACAAAGTATA AC-A.CCGCTT AAGATA.G-GG TAGG7GAAAG TGAAAGTAAG TTTTCAGGGT 900
ATACCATTGG CCAATTACAA GA.GAAAAA.rG TTCAATTTCT TGAAGAATCC TTTGTTGACT 960



TGGGTTTTCA TCTCTTGCTA TTTAGAGGGG TG-GGGGG.-G 7GAAGAAAGG CTTATTTGCT 1020 



GAGGAAGGAC -TTTGCTGCAC TTA.CTG7A7 7 ACA7GAAAGA G7GG2GAGGG TGGTGTTTAA 1080 

CTTTTTAAAA AATGTTA.TTC TGATTAGAAG .-AG AA-GATI G- 3GG7GGTTGA. TGAAAACAGC 1140

GCCACCTTGC AAGGTTTAGT GAGATGGA7G G-AGGGG-AG AG 77 AA.GGAG GAATTGCTGC 1200
TAGCTCCAAA AATTTQCGAA GCAAAA7CGA GC7GGAAG77- 3~GG-3AAj3T TTGAAA.CTGA 1260



TTAACAGATT TGCATTTGAA. GTGACGGGAG AGA7GAGG7G GAGAG-.TTAG TTAAAAA.TAG 1320 



AAAGAGGAAT AAAG-jGATGT YTTCTGTGGA GAA.-AGATAA GAGG?GAAGT AATAATCGTT 1380

CCCACTTTCA TTGAGATGAG CTTGTGTGAG AA2G7GATA7 GAG7GTGAGA ATGATAAAGA 1440

TGA.TAATAGT C^AjCTTTTG TAATT77GCG GG7GGAGTGA AGA-GATAGT AAAKGATGAG 1500
HCAIflCXTTT— ctvggaacat ycctagvggt AGAjIGTAGTG GACG7GAAAT TGGGAATTAT 1560
AAjCTGTCCTA ATTTTTGTTG TGGACCG7GA TCG2CG7T77 GG77TAATAG CGACAGTGTA 1620

ACAATTAAAT ATGAGACTAG GACATAGGAG T7AAG7AGGA TATTGGAAAG ATAAATTTTA 1680
GGGGTAAATG TTTACTTCAA AATGACTCCA TATTTCAAAT ATCTGTLT.-G ACT GTCAAGG 1740

CCAAATAATT TTTAAGAAAA CATTTGAAGA GTAGTGTGTT TGOJTTTGTG AATAATCTTA 1800

CTO.CAGC-A GTAAACGTAA TAAAAGGCAA O.TTT AAGC C AAAAAAAAAA AAAAAA 1856

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1558 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

TGGGTATCCA TTCCTGNAAT TACTTTACTT AGGATAATGG CCTCCAGCTC CGTCCAAGTT 60

GCTGCAAAAG GTATTATTTC GTTCCTTTTT GTGGCTGAGT AGTATTCCAT GGTGTATATA 120

TACCACATTT TGTTTATCCA CTCATTGCTT GATGGGCAGT TAGGTTGGTT CCACATCTTT 180

GGAATTGTGA GTTGTGCTGC TCGAGATATC ATCTTTAACT CCTTTGCCTT CTCCACATAC 240
ATTTCCAAGT CCTGTTCATT CTACCTCCAA AATGTATGTT GTATCCATTC ATCTCTCTCC 300

ATCTTCAATC TATTTCAATG CCCCATCATC TCTTGCATGG AGGAGTGTAA TAATTGGCTA 360

ACTGGGCTGT TCTTACATTT TAAAATCAAA A3ATGTGACA GGTGAAATGG CTATTTCAGT 420

GTCCATTGAT GGTTCTGGTT ACACACCACC TGGCTGGCTG GTGTCGCAGT GGCAGAGTTG 480

AGCAGTGTGA AAAAGACTGC TTGGCCCTTT ACAGGGAAAG CAGGTCCACT GTGGCGTGTG 540 

AGGACGAGAG CTCTGGGCAG GGTCGGACAC TGGCAGACCC TGGTCCTGGC TGGCCAAGGC 600 

AGCAGGGTAT GTGTTTCGGG TCACTCACAG GGCTCAGCAC CACTGCTCAT GGCTTCCTTA 660

CTGTTTCGGG AGAGGCTGAC CCGCGGCTGA TTGAGTCCCT CTCCCAGATG CTGTCCATGG 720 

GCTTCTCTGA TGAAGGGGGC TGGCTCACCA GGCTCCTGCA GACCAAGA^C TATGACATCG 780 

GAGCGGCTCT GGACACGATC CAGTATTCAA AGCATCCCCC GCCGTTGTGA CCACTTTTGC 840

CCACCTCTTC TGCGTGGCCC TCTTCTGTGT CATAGTTGTG TTAAGGTTGG GTAGAATTGC 900

AGGTCTCTGT ACGGGGCAGT TTCTCTGCCT TCTTCCAGGA TCAGGGGTTA GGGTGGAAGA 960

AGCCATTTAG GGGAGGAAAA CAAGTGACAT G-AG-GGAGGG TCCCTGTGTG TGTGTGTGGT 1020 

GATGTTTCCT GGGTGCCCTG GGTGCTTGGA GGAGGGGTGG GGCTGGGAGA CCCAAGGCTC 1080 

55 

ACTGCAGCGC GCTCCTGACC CCTCCCTGCA GGGGCTACGT TAGCAGGCCA GCACATA3GT 1140 

TGCGTAATGG CTTTCACTTT CTCTTTTGTT TTAAATGAGT CATAGGTCCC TGACATTTAG 1200 

60 TTGATTATTT TCTGCTACAG ACCTGGTACA CTCTGATTTT AGATAAAGTA AGCCTAGGTG 1250 
TTGTCAGCAG GCAGGCTGGG GAGGCC-JGTG TTGTGGGCTT CCTGCTGGGA CTGAGAAGGC 1320

TCACGAAGGG CATCCGCAAT GTTGGTTTCA CTGAGAGCTG CCTCCTG3TC TCTTCACCAC ■ 13 30 

5 

TGTAGTTCTC TCATTTCCAA ACCATCAGC? GCTTTTAAAA TAAGATCTCT TTG7AGCCAT 1440 

CCTGTTAAAT TTGTAAACAA TCTAATTAAA TGGCATCAGC ACTTTAACGA AAAAAAAAAA 1300. 

10 AAA-AAAAAA AAANAAAAAA AAAAGGGGGC CGCTCTACAG GTCCAAGTTA NGACGNGG 1553 

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 948 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

25 TAAAAATCAT GCTCTGTACC ATCCTCACCG TAGTCATGAT CATCGCCGCG GAGACGACGA 60 

GAACTACTGG GATGGCTAAA AACGCCCCTG GTCCGGCCCC ACTGTGGGCG CCTCGATCTC . 120 

CCAGGCTCTT TCTGGAGWCA TACCGGGGAG CCAATGGGCG CGGTGGACAG CCGTTTCTGG 130 

30 

GGCCGTCAGA CTTGGATACA TCGTAAACTC CGCCTCCACG GAACGTGTCG CGTKGCGAGC -240 

AAGCTCGGAA TCCAGTTCCT G*GGAACCCC TCCAAAACCG ACACCCCCAG GGACGCCGCT 300 

35 TTCCGGGATC CCGGSCAAAG GCCGGACCCT CAGTCGCTCC AGGGCCCGTC ACCCTCAAAG 360 

TGTAGCGCCC CCAACGGAGC AACCTCGGTT TGGTCGGTAA AACCCCGCCT CCTCTATAAG 420 

CACCGCCCCA GCTCTGACAA AACCCCGCCT CCAGGTCGGC AGGCTCCGCT TCTTTTCTTG 430 

40 

TCCGCGGGGT GATTCAGTCG AGTGATTGGG TTTGTGGCTC . CAGOGCTGGC GCACAGACGG 540 

ACAGACCCCT CCCETTCTTC CGGCAAAAGG ACCGAGCCCT GGGGTAGTAA GGSCCCCACA 600 

45 CTCGTGTTTT TTGCAAGTAC ATTTTTGTCC YTCCTCCACC CAGGTATCTG CCTATTTTCT 560 

TGCTAATCCC AGAACCTTTC CTTTTGCTTT TTTTAAGGAC ATTTGGGAAG TTCCTGGTGT 720 

AGGACCCTTG TCCCTGGGAT AAGAAACCTG CCTGTAAACG CTCTGTAAAT ACTCCCTTCC 730 

50 

ACCCATCCCA GCCCCTGGGC AGCCGGGCAG AAGGGAATCC AGGCTATGGA CCTCCCAAGT 840 

CCCCGCTCCC CGCTCCCCTC GGCGGCCCCG CCTTGTTCTG ATCTGTGTGT GAGTGTGTGT 900 

GAACTTGTGA AAGACAATAT TAAAGAGACT TAGTTGAAAA AAAAAAAA 948

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 990 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

10 GGGGAACTGC AGTGACAGCA CG-AGTAAGAG TGGGAGGCAG GACAGAGCTG GGACACAGGT 60 

ATGGAGAGGG GGTTCAGC GA GCCTAGAGAG GGCAGACTAT CAGGGTGCCG GCGGTGAGAA 120 

TC C AGGGAG A GGAGC'GGAAA. CAGAAGAGGG GGAGAAGACG GGGGCACTTG TGGGTTGCAG 130 

15 

AGCC CCTCAG CCATGTTGGG AGCCAAGCCA. CACTGGCTAC CAGGTCCCCT AGACAGTCCC 240 

GGGCTGCCCT TGGTTCTGGT GCTTCTGGCC CTGGGGGCCG GGTGGGCCCA GG AGGGGTCA 300 

20 GAGCCCGTCC TGCTGGAGGG GGAGTGCCTG GTGGTCTGTG AGCCTGGGCG AGCTGCTGCA 360 

GGGGGGCCCG GGGGAGCAGC CCTGGGAGAG GCACCCCCTG GGCGAGTGGC ATTTGYTGCG 420 

GTCCGAAGCC ACCACCATGA GCCAGCAGGG GAAACCGGCA ATGGCACCAG TGGGGCCATC 430 

25 

TACTTCGACC AGGTCCTGGT GAACGAGGGC QGTGGCTTTG ACCGGGCCTC TGGCTCCTTC 540 

GTAGGCCCTG TCCGGGGTGT CTACAGCTTC CGGTTCCATG TGGTGAAGGT GTACAACCGC 600 

30 CAAACTGTCC AGGTGAGCCT GATGCTGAAC ACGTGGCCTG TCATCTCAGG CTTTGCCAAT 660 

GATCCTGACG TGACCCGGGA GGCA.GCCACC AGCTCTGTGC TACTGCCCTT GGACCCTGGG 720 

GAC C GAGTGT CTCTGCGCCT GCGTCGGGGG NAATCTACTG GGTGGTTGGA AATACTCAAG 730 

35 

TTTCTCTGGC " TTCCTCATCT TCCCTCTCTG AAGGACCCAA GTCTTTCAAG CACAAGAATC 340 

CAGCCCCTGA CAACTTTCTT CTGCCCTCTC TTGCCC CANA AACAGCAMAA GCAGGANANA 900 
40 NACTCCCTCT GGCTCCTATC CCAGCTCTTT GCATGGGAAC CTGTGCCAAA. CACCCAAGTT . 960 

TAAGAAAAAA ATAAAACTGT GGCATCTCCA 990 



(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1603 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

GGTCGACCCA CGCGTCCGGC CGGCCGGCTC CGGAGCGGCT CTGCCTTCCC GAGCGCGGGA 60 
CCGCCCCCTG GGGGAGGAGG GCGAACGACG CGGCGATGGC TCCGCGGGCA CTCCCGGGGT 120 

60 
CCGCCGTCCT AGO CGCTGCT GTCTTCGTGG GAGGCGCCGT GAGTTCGCCG CTGGTGGCTC 180

CGGACAA.TGG GAGCAGCCGC ACATTGCACT CC1AGAACAGA GACGAGGGGG TCGCCCAGCA 240 

5 ACGATACTGG GAA.TGGACAC COJGAA.TA.TA, TTGCATACGG GCTTGTCCCT GTGTTCTTTA 300 

TCATGGGTCT CTTTGGCGTC CTCATTTNGC CAMCTNGGTT NAAGAAGAAA GGGTATGGTT 360 

GTACAACAGA. AGGAGAGCAA GATA.TCGAAG AAGAAAAAGG TTGAAAAGWT AGFATTGAA.T " d?0 

10 

„ GACAGTGTGA ATGAAAACAG TGA.CACTGTT GGGGAAATCG TCCACTACAT CATGAAAAAT 480 

GAACCGAATG CTGATGTYTT AAAGGCGATG GTAGCAGATA ACAGGCTGTA TGATCCTGAA 540 

15 AGCGGGGTGA CCGGGAGCAC ACCAGGGAGC CCGCCAGTGA GTCCTGGGCT TTGTCACCA.G 600 
GGGGGACGCC AGGGAAGCAC GTCTGTGGCC ATGATCTGCA T AC GGTGGGG GGTGTWGTCC - 550 

AGAGGGATGT GTGTCATCGG TGTAGGCACA AGCGGTGGCA CTTTATAAAG CCCACTAACA 720 

20 

AGTCCAGAGA GAGGAGACCA CGGGGCCAAG GC G AGGTCAC GGTCCTTTCT GTTGGCAGAT 730 

TTAGAGTMAC AAAAGTGGAG CACAAGTCAA AC CAGAAGG A ACGGAGAAGC CTGATGTCTG 840 

25 TTAGTGGGGC TGAAACCGTC AATGGGGAGG TGCCGGCAAC AC CTGTGAAG AGAGAACGCA 900 

GTGGCA.CAGA GTAGCAGGTG AGCCGTGGTT TTGGTGACAT TGGGGGCAGA. GTGGTGCAGG 960 

■ GTGAGGAGAA GGTACTTGGA GCCTCCCAGG TGCTGTGGCA- GC ATAGGAAT " GGTATTTGAC 1020 

30 

AGGGAAGTGG GAGAGCTTTC CTTGACCCAG GAAGA.CTGAG GGGGACTGAA CATGATTACT 10S0 

TGTCTGCCTA GAGCTTCTTG TAAAGAAGTC ACAAACTTAG TGCCTCCAGG GGCTTGGCTG 1140 

35 TGTGATAATG AGGATAGAGG ATTACTTGTG AGGCAATGTG GCATGGTGGG GATTGTGGCA 1200 

AACTAGAATT CACATCACCC ACCATATAGG GCTTGCATXA CCACGAGGCA GAAAGCACCT 1250 

AGTGTTGCTG CATCTTCTTA CGCAAAAAAG ACAAAATCCA GACTTCTAAA ATGTAAAATC 1320 

40 

ACTGATTTTC GATATTGGCA GCTTACTTTT TTTTTTTAAA CAACCA.TGCA. GGCCAAA.TGA 13 30 

CTTGTAATCT TGTCACCATT TTTAGGTAAA CTGTGACTTG AAAAAGTCTG GAGCAAACAA 1440 

45 ACCAATGCTT ' TTTCCTTTTA TTCTGTTGGR AACCAGTTTT CTTTGTGTCA CAGTTYTGAA 1500 

ACCTCAATAC GAATATTTCT CTTCCCACCA AATATTTTGA GGCAATTGAA AAGCCACAGT 1550 

'GA.TTTA.TTTC TTGATTTGGC AA.TTTTAATT TTGCAAGACA ATT 1503 

50 

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1052 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

TACACCTCAG GATGCCTGTA ACATTGTCAT CTCTGGGGTT CTGCGTCCTG CTTAGCCTGC 50 

5 

TTTTTCCCTG GAGGACTGAC CAGGGATGCG GCCCAGCAAC ATGTTACTAA ATCATACTCT 120 

CCTCCCTACC TTTCCCAGAC CTCTCACTCC TGCCTGGTGT TCCAACCCGT TCTGTGGC CA 130 

10 GA3TATACAT TTTGGAACCT CTTCGAGGCC ATCCTGCAGT TCCAGA.TGAA CCATAGCGTG 240 

CTTCAGCAGM AAGGCCCGAG ACATGTATGC AGAGGAGCGG AAGAGGCAGC AGCTGGAGAG 300 

GGACCAGGCT ACAGTGACAG aGCAGCTGCT GCGAGAGGGG CTOCAAGCCA GTGGGGACGC 360 

• 15 

CCAGCTCCGA. AGGACACGCT TGCACAAACT CTCGGCCAGA CGGGAAGAGC GAGTCCAAGG 420- 

CTTCGTGCAG GCCTTGGAAC TOA.GCGAGC TGACTGGCTG GCCCGTCTGG GCACTGCATC 430 

20 AGCCTGAATG AGGCTGGCCA CCTGCCACTT TGCCCTGCCC TCTGCCTCCA GGGCTCCMCT 540 

MYCCTTCCTT TTCTTGGTGA AAGGCACCTC CTTTCCTGAT AATGAATGGT GTTCCCTTTG 600 

CTTGGCTGGG GAGCCCCCCA GGCCAGGTTT GCTGGCCATA GATACCTTTG GGCTGCCTG?- SoQ 

25 

GACAGGCTCC TGAGGAGGAT TGAGGGTGAA AGTCTCCCAC GAGTACACTA AACCTAGGTC 720 

TGGTCACCAA TAGGGTTTGG AGMCAAAGG GCCACAACTC ATCAGCTGCC TGTCTCTTAG 730 

30 " ATGCACTTTC TTTTTCCACC AGCACATCCT TCAACACACA GAATTTCAGG GAAGAGTTCT 340 

CCCGAAAACC CTAGCTCTTT ACCCTTCCAT TTTAGCCTTC CACCCAGCTT CCACAAAAGA 900 

TTTGGCTCTA CCTTGGATCT GCTAGTAAAT A-.CTAATAGG CAGGCAGTTA TTTGGGTAAG 960 

. 35 

GAAAPAAGGG GTGGGPvGAGA O^AAAATTT GCCCACTGCT GCTGCTCCCC TTGGSTYTCC 1020 

ACGTGGGATT TGCTATTCAA TCTCTACCCT MM 1052 

40 

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 814 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
ACMCGNTGGC GGCCGCTCTA GAACTAGGGG AMCCCCCGGG CTGCAGGAAT TCGGCACGAG 50 
55 CATAGACTTT TAAACTGGTA CGGTTCTTAG AGATGGTCCT TGGCCTTCTG TTGTTGTTGT - 120 

XGTTTTTTTC TTTTTCTTCT TCTCCTTC TC CTTCT^CTTC TCT T CTCCTT C T T T C TTCTT 130 
TTTTTTTTCA GAGTCTTGCT CTGTCACCAA GACTGC- AGTG A^GTGATGTG ATCTCGGCTT 240 - 

60 
ACTGCAACCT GGGAC-GCAGA GGTTGCAGTG AGTCGAGATG GTGCCATTGC TCTCGTTTGG
GCAACAAGAG TGAAACTCTT GTCTCAAAAA AAAAAAAAAA ATGAGCTTTA AGACAGTTTT 
GTCATTACTG GTGGGATCTG GTCACACAAG ATAGCATTAA ACGTC-ACATG GCACATAAAA 
TTCGTTAAAA .-ATTTTGTTT TTTAATTACG TAATGTAAAA GGCCAACAAA O.CTTTATC-C 
AAGA.TTGGAA TGTA.TCTTCA AATTCAGATT TAATAAACAT GT AAAGAT C C TCTGTAXATA 
AAAGTTGTAT TTAATGCCTT GTGCCCCAAG AATGCTATAA AAGATCCCAA GAATGTTATC 
TA.TGAAAAGA TAGCAATAGG GAATGGTGAA CAAATAATTT AATTTGCCAA TTCTAAAAAA 
CATGGACTTA AACCCGATGA AAACTTGGTT CCATAGTTTT AACTGTTTTA TGGTTCCAAT 
ACAAAACCAG AGTGGTTTAC ATTCCACAAT MACCAAATTT GCATCCAATN TTGGGGTAAT 
TTTNGGTAXT TGCCATGGGA TACTATTCAT TTTT 

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1215 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

AGAGGAAGTC TTTTGCCAAG CCTGTTCTCT GGACTAACGC CATCCAGGCT GGGAjGGGGAA 

GAGTGCTCTG CTA.CACTCGT CCCCCTCCTG CCTCATCTTC CTTCTCAGCC TTGGTTCCTG 

ATGGGAACAG AATGGAGGGC CTGAGAACAT ACTTTCTAAA TGCCTTTGAC CCAGGAACCG 

ATTATCTATA TTTGTT COCA TTTTCCTTCA CCGTGACATT CCAGCA T T G T CTGACT G TGA 

GGTGGGCCTT TGAGAGCCTC. CAGGTTCCTC AAAACAGGCC TGAGCGATGG GGATCACACC 

CTCTGCCTAC CCACRTGCCT GCTTACCTGC CAGATAACCA. AGTGMAGATG TCTGCGAGTG 

GCTAGTTTTC ACATTCTTAC TAGTGTTTGG YTCACCTTTG GGCAAAGGCC CCCTCTAGGC 

CTTGCCCCAC CTCCATCAAA CGCAGACACT GTAGTCAGAC CTCAGYAATA TAGGAGGCAA 

TAATCTTTTA ACAGTGTTTT GCAAACAAAC AAAAAGAGAA AAATCCCAGC CAGGGGAACT 

CGCCACCZGC CCACGCTAGT TCCATCCACG CTCAAGACCC . GCCCTTA.GAC CACGCAGGCA 

AAGGCCCCCA TCACACTCGG C'CACTAGTGG GGTCCTGAGG CCAAGAAAjGA AACCAGACCC 

TGTATGACAA GTTGGGKTCT TTC'GAGAACA CGACAGAAAC AGGGGGGGCC CCTTGTTAAT 

GCCACTCCAT ACTCCAGAAG CATTATTCCT TATTTGGGAC AGCCAAGGGC AGA.TTCACAG 

GTTATTGTAG GAATAAAGAC TAGTTTACAA AGGARAAAGA GSCCCTGGAC TTCCQ1AGGA 
AAGGTCAGGT TAGGGCTCCT GTACCCATTC TGTTCCACCA CTGTTTGATC TCTCTGC-CCT
CCCACCAGGA ATGCCGTTTC C7TTTTATGG ATCTGTTGGG AACCAGAGAG AAJTCAA.CAGA 
TCAATGACAT AGGATCCGAA GTGCAA.TGAT AGTCACTTCT AGTTTGGCAT TTCACAAACT 
CTGMACAGCA AGGTATTGGT AGGTTACTCA ATTTCAAAAG GGCCCCATGG CCAAATATGT 
TTAGGAACCG CTGTTTGNAT TTCTTTTTTT GGAGACGCAT TGTATATAAT ATATGTCAAA 
GGCTTTCGGA ATTCCTGCAG GAAAGAAATC AGCTTTGTTA AATCCNAAAA AAAAAAAAAA 
AAAAAAATAG ACTCG 

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 478 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

ATTTCTTATG ACATGGGGGT TTGAATTGGT TGGCAAATGT TTAATT-TTAA TATCCATAAT 

CAGTGAGGTC CTGCTGGCTG TAATCATTAA TTGTGAAATC TAAGGAGCTT. .AGTTCATGGC 

TCTAGAATTT CACAGAAAAR TGYGMTATGA TACGAGCATT AAGTTTATTT CTTCTGATCT 

TTGATGCAGC TTTGTTCAGT TTATCTGTTT TTGTATTTAT TGGTCATCTA CTTCCCATGC 

CAAAAGGGAC TGGTCTACAT AGCTGCGCTA AACACCTGAT ■ CAAATC ACT A AAAGAAAATG 

TGTTACCTCT AATGAATTAT CCTGATTGTA AGTTAAAAAT CAATATTTCC CCGTAGTGAG 

GTTTGCTTTT TAAAAAGAAK KCTTAAAAAA AAAAAAAAAA AAACGAGTTN AAGAAAAGGA 

AGCAAGCTCA GGTAAGGTGC AC\CATTGGG CTAAGGAAGC TAGAGCCTGT GGAGAMGC 

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 618 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

TATGACCTTG ATAACCCCAA GTTNGAAATT AACCTTCANT AAAGGGAACA AAAGCTGGAG 
TTCGCGCGCT TGCAGTTCGA CACTAGTGGA TCCCAP-AGAA TTCC-GCACGA GTCATAATGA 
GCTACTAGGT AAGCCTTCTG GGACTTTCAG ATATTTTGOG GAAGATTGAT TTTTGTTCTT

ACATGCTGTG GACCCTTGGC CATCAAATGG TATGGGGAAG CTCATCCGTC TGTCTGTGAT 

GGTCATGTCA GTCAGGCG7C TTtTT.-JGTAT TTACXGGGTG CTCAGTACTG TGCCAGATGC 

TGTCGGGAGC. CGTGGTGGTA TGGAGGAGGA GTGCTCCAGA GGACTCTGCT GTGTGGCAGG 

CCAGCATAAA CAAGCCAAGG GGAAAAGGCA GGCATGGAAT AAAGGGGGAG AATAC-GAGTG 

TGTGACTTAC TGCTGACTGT GTGGATTAGC CTATCAGCAG TAATCA?sGCA GGGCGGAGGG 

CATTATCTTT GAGCCAGAAG AGTGAGCACT GG SCCGAGGG TGGAGCATCA AGAGGGGGTG 

TAGGACCNCA AGGCTTCTTN CNGGGGAGAC AAGGTCAATA AGGNGTGAGT AGTCACCGAC 
AGTTTTGGGA AGCAAGGG 

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 751 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

TCGACCCACG CGTCCGAGGA GGTGGACTTC TGAGACAGGC ATTCTCCTTG CVTAGCACTG
TCTGCTGCTA CAGCTCATAG AAGTCAACAA TTTTCTTCAA CACTGGTAGG CAGCCTCTAA
ATGGCCCTGA TCACCCTCAC CTCCTGCCAT TCACACCNNT GTAAAATTCC ACCCCTGGAC
CTAGTGACTC ACTTCTAACA ANGAGAATAC AGCAAAAGTA ACATCGCTTC TGP-GGTGAGG
CTACAAGGAG ACTACGATGC CTGCCTTGGT CACCCTTCTC CTGCTCTTTC CATTGCTGCC
TCTGATGGAA GCCAGTTCCC ATGTGATGAG GTGCCCTATG G^^JZOCCC\ CGTGACAAGG
TATTGTAAAA AGCCTCTGAC CAATAGCCAT CTAGAAACGG AGGCCCAGTC OUGCAGCCTC
TGAGATGAAT CCTGCCAACC TGAGCTTGGA GACAGATTCT CTCCCTATCC TGCCTTGGGA
TGATCACAGC CACCVICAAC ACCTTCACTG CCTGGTGAGA GGC CAAGCCA GTGAACCCAA
GGTAAACTGG ACAGAATCCT GACCCACAGA AACTGAGATA ATGTTTGTTA TTTTAAGCTG
CTCAGTTTGT TAGAGAGCAA TAGATAACTA ACTC.-AACAC CATAAAATTC TAATATTTTA
TTCTATCACA CAAACCAGGT AATACCAAGT AAATGCCATT ACTATACACA TATTTTTGTA
ACACAATTAC ATGTGATTTT TTAAGAAGGC T
(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 780 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
o;g::cagtca c:gtccccga tzcccggg^c gzcccacgcg tccgggttgg caactcctga 
gg7gacttca. c-.ttttccta cctctccttc ta-atctcttc ' ?agagcacct 
gctatcccca acttctagac ctgcrtccaaa ctagtgacta ggatagaa.tt tgatccccta 
acrca^ttgtc tgcggtgctc attgctgcta. acagcattgc ctgtgctctc ctctcagggg 
caozadgcta acc-ggr-cgac gt cotaatcc aactgggaga agcctcagtg gtggaattcc 
aggcactttc- actgtcaagc tc-3caac-ggc cagga.ttggg ggaatggagc tggggcttag 
cigsgmchg gtctqaagca gacagggaat ggg.agaggag gatgggaagt aga.cagtggc 
tggtatgggr ctgagc-ctcc crggcxrcctg ctcaagctcc tcctgctcct tgctgttttc 
tcargatttg g3ggcztgg3 agtccctttg tcctcatctg agactgaaat gtggggatcc 
aggatggcct tccttcctct tacccttcct ccctcagcct gcaacctcta _ tcctggaacc 
tgtctt ccct "ctccccaa. ciatgca.tct gttgtctgct cctctgcaaa ggccagccag 
cttc-g3ag--a c-zagagaaat aaacagcatt tctgatgcca aa.-aaaa-aaa aaaaaaaacc 
gcz-gccgaaa c-ctta7tncc ctttaagtaa ggggttaatt tttagcttgg gcactmggcc 

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 588 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

TTCCGAA.TTA A-DCGACTCAC TATAGGAAy/T GCCGTCGCCA TGACCCGCGG TAACCAGCGT 

GAGCrCGCCC GCC.-GAAGAA. T ATGAAAAAG CAGAGCGACT CGGTTAAGGG ' AAAGCGCCGA 

GA7GAC-GGCC TTTCTGCTGC CGCCOGCAAG O.GAGGGACT OGGAGATCAT GCAGCAGAAG 

CAG.AAAAAJGG CAAACGAGAA GAAGGAGGAA CCCAAGTAGC TTTGTGGCTT CGTGTCCAAC 

C-rrCTTGCCC TTCGCCTGTG TGC CTGGAGC CAGTCCCACC ACGCTCGCCT TTCCTCCTGT 
AGTGCTCACA GGTCCCAGCA CCGATGGCAT TCCCTTTGCC CTGAGTCTGC AGCGGGTCCC 360

TTTTGTGCTT CCTTCCCCTC AGGtAGCCTC TCTCCCCCTG GGCCACTCCC GGGGGTGftGG 420 

GGGTTACCCC TTCCCAGTGT TTTTTATTCC TGTGGGGCTC ACCCCAAAGT ATTAAAAGTA 430 

GCTTTGTAAT TCCAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 540

AAAAAAAAAA AAAAAAAAAA AAAANNCGGG GGGGGGCCCC CCCCCCCC 588



(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 774 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:



TTTAAAGATG AAGAAATGAC AAGGGAGGGA GATGAGATGG AAAGGTGTTT GGAAGAGATA 50 

AGGGGTCTRA GAAAGAAATT TAGGGCTCTG C^TTCTAACC ATAGGCATTC TCGGGACCGT 120 

CCTTATCCCA TTTAATTAAT TTCTCTGACA ATTCAATTAT TTTCTGTTAT TAATGTTGCC 130 

ACTGCTTTCT GTTTGTCTGC ACTTTCTTGA TAAATATTTG CTATC GTT TT ACTCCAGTCA 240 

TTCGATGTTG CTGAGATTTA CATATGACTC TTGTCAACAT - CTCATCTTTT GACCCAATCT 300 ' 

TA.TTCATTTA ATAAGAGGTC TCATTCATTT GCATGGAAAA ATGCTCATTG TAT ATTGCAA 360 

AGTGAAAATA ACGAGTTGCA AAACAGTGTA TACATA.TA.TG TGTGTATATA TGTACACTTT 420 

ATTTGTACAT TTCTATGTGA CATAATGCAA AGGAAAGTGT CTGATTTTAT TATACACCAA 430 

AGGTTAACAG TGAATCTCTG TGTGATCTCT TTTTTTTTCT TTTTGCCTA.T CTGCATCTTC 540 
TCACTTGCCA AAAAATGAAT ATATGTTTAT GTGTGTATAT TACTTGTGTC ACAAAAAACC. - • 500 

CTAAAGTAGA CAGTAAAAGA ACTTGTCAAT CGCCTTTGGA AGGCAATGAA ACACTTAATA 660 

AACTCTCAAT AACAGAAGCG TAAAAATGAA ATGTAAACCT CCAATTACCT CTGGATCTCT 720 

TAGCCAGAGT AATAAACTGG TAATTATTAC AGATAAAAAA AAAAAAAAAA AAMA 774 



(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1866 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

ACCCACGCGT CCGGTCCTCT TCTTCAGCAC ATGCCAAAGC TGTTCCTCAC C-GCCTGTGAG 60 

ACAAGAGCAT CTTGGATGTA G3ACAATGGA AGAGTTAGAT GCCTTATTGG AGGAACTGGA 120 

ACGCTCCACC CITCAGGACA GTGATGAATA TTCCAACCCA GCTCCTCTTC CCCTGGATCA 130 

GCATTC CAGA AAGGAGACTA ACCTTGATGA GACTTCGGAG ATCCTTTCTA TTCAGGATAA . 240 

CACAAGTCCC TTGCCGGCGC ANTOGTGTAT ACTACCAATA TCCAGGAGCT CAA.TGTCTAC 300 

. AGTGAAGCCC AAGAGCCAAA GGAATCACC A CCACCTTCTA AAACGTCAGC AGCTGCTCAG 360" 

15 TTGGATGAGC TCATGGCTCA CCTGACTGAG ATGCAGGCCA AGGTTGCAGT GAGAGCAGAT 420 

GCTGGCAAGA AGCACTTACC AGACAAGCAG GATCACAAGG CCTCCCTGGA CTCAATGCTT 4S0 

GGGGGTCTSG AGCAGGAATT GCAGGACCTT GGCATTGCCA CAGTGCCCAA GGGCCATTGT 540 

20 

GCATCCTGCC AGAAACCGAT TGCTGGGAAG GTGATCCATG CTCTAGGGCA AXCATGGCAT 600 

CCTGAGCATT TTGTCTGTAC TCATTGCAAA GAAGAGATTG GCTCCAGTCC CTTCTTTGAG 560 

25 CGGAGTGGCT TGGMCTACTG CCCCAACGAC TACCACCAAC TTTTTTCTCC ACGCTGTGCT 720 

TACTGCGCTG CTCCCATCCT GGATAAAGTG CTGACAGCAA. TGAACCAGAC CTGGCAGCCA 7S0 

GAGCACTTCT TCTGC TCTCA CTGCGGAGAG GTGTTTGGTG CAGAAGGCTT TCATGAGAAG 840 

30 

GACAAGAAGC CATATTGCCG AAAGGATTTC TTAGCCATGT TCTCACCCAA GTGTGGTGGC 900* 

TGCAATCGCC CAGTGTTGGA AAACTACCTT TCAGCCATGG ACACTGTCTG GCACCCAGAG 960 

35 TGCTTTGTTT GTGGGGACTG GTTCACCAGT TTTTCTACTG GCTCCTTCTT TGAACTGGAT 1020 

GGACGTCCAT TCTGTGAGCT CCATTACCAT CACCGCCGGG GAACGCTCTG CCATGGGTGT 1080 

GGGCAGCCCA TCACTGGCCG TTGTATCAGT GCCATGGGGT ACAAGTTCCA TCCTGAGCAC 1140 

40 

TTTGTGTGTG .CTTTCTGCCT GACACAGTTG TCGAAGGGCA TTTTCAGGGA GCAGAATGAC 1200 

AAGACCTATT GTCAACCTTG CTTCAATAAG CTCTTCCCAC TGTAATGCCA ACTGATCCAT 1260 

45 AGCCTCTTCA GA.TTCCTTAT AAAATTTAAA CCAAGAGAGG AGAGGAAAGG GTAAATTTTC 1320 

TGTTACTGAC CTTCTGCTTA ATAGTCTTAT AGAAAAAGGA AAGGTGATGA GCAAATAAAG 1330 

GAACTTCTAG ACTTTACATG ACTAGGCTGA TAATCTTATT TTTTAGGCTT CTATAGAGTT 1440 

50 

AATTCTATAA ATTCTCTTTC TCCCTCTCTT CTCCAATCAA GCACTTGGAG TTAGATCTAG 1500 

GTCCTTCTAT CTCGTCCCTG TACAGATGTA TTTTCCACTT GCATAATT.CA TGCCAACACT 1560 

55 GGTTTTCTTA GGTTTCTCCA TTTTCACCTC TAGTGATGGC CCTACTCATA TCTTCTCTAA 1620 

TTTGGTCCTG ATACTTGTTT CTTTTCACGT TTTCCCATTT CCCTGTGGCT CACTGTCTTA 1530. 

CAATCACTGC TGTGGAATCA TGATACCACT TTTAGCTCTT TGCATCTTCC TTCAGTGTAT 1740 

60 
TTTTGTTTTT OAAGAGGAAG TAGftTTTTAA CTC^ACAACT TTGAGTACTG ACATCATTGA
TAAATAAACT GGCTTGTC-G? TTCAATAJLAA JVAAAAAAAAA MAAAAAAAA AAAAAAAAAA 
AAAAAA 

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1132 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
CTCAAGGATG TAAAGGCTCT GCAGATTTCG GGAGGCCTGT CTCCCAGCAC CTGATGGGAC 
ACTTTTTGCC CCACTGTAAA TTCTGGGTGT ATCCTCCACT GTATGCTGTC ACCCCAAGGG 
CAAGCACTGC ATCTGCTTAG TGAAGGATTT ATTGTTCGGA AGATACATTT TCQCCTTKAG 
CAGAGAGTGG CGTATCCTGG CAGTCTTGGG TGAGCCAGTT GTACGAGGAT TATGAAATGC 
AGATGTTTAC TGTGTCATTG TTGCTCTCAT TGCTACTGAG GAGTACTGAC OiGA.ATC.iTC 
TGCAACTYTT AGTTGGCAGA GAGGACCACT ATGGCGGGTA GCTCTTTTCT TTCCTGGCAT 
TGTGGGGATG ATTCCAGGCC AAAGATGATG GARAAGTATG GAAATCATCT GAAAGGTTGA 
AGCTTGGCAC GTGAAGCCAT TCATGACTTT GTAAGGCAGT TTTGCTGAAG GCCAGTTCTG 
CCCTGGGAGG GACGGAGGTG AATCCTCCTG AGTACCTGTG GTTTTCTTAC TTCCTGCTGA 
ATTTACCTAA GTGCCTGTTG TTTGCTTGCT GTGGAGGCTT TCTGGTATTT CATTTCAGGT 
GCAGATGCCT TCACTTTCCC ACCFAAAAAA CCCCMACCAA ACCTAAGACC TTACTGCAAC 
TAAGTYTNCC AAGT ACTTTT TAACCCAATG GGATGAACAG CCTGTGGTCT GCTCAGATCA 
CCCTGAGTGC GTGTGAGAAG GCMTNGGCTT TGCCAGGAAA TCCAGGAAGG CAGGGCCGGG 
CTGTGTTGGA AGCTGGCTTA GCTTCGTC-GCG CAGCCTTATT TCAATTAAAA GGGCATTGAC 
TGGGAGCAGC AGTCCTGGAG TTTGTTGCAT TTCCTATTGC CCTCAAAATG AGAAACCAGG 
AAAATAGCAG ATTGGAGCCT TCGAGAAGGC AGTAAATGGC TGTTTTTATT GACAAAAGGA 
AAACATTTTA CTGCCATCTC ACTGATGGCA TCTCACTGAC TTAAAATGAA GGCANGTTGT 
AGTAAAAAAA AAAGTCTACA TTTTTCCACC GCCACGTTCT TATATCCTGT TTGTCAGCCA 
CTGCTCAMAA GGGCATGTTG TCTTGCGGAN TAMAGGCGCT CTCCTTCCCT CGTTTTCCCT 
(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2483 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
AGCAGGCGGT GCGCTGGGGG CGGGAGCAGC GCGKAGCCCG GCTCGGCCAC ACCGATCGCC 60 



15 ' CGCCGCCATG GGCTCCTCGC AAAGCGTCGA. GATCCCGGGC GGC-GGCACCG AGGGCT AC CA 120 



CGTTCTGCGG GTACAAGAAA ATTCCCCAGG ACACAGAGCT GGTTTGGAGC CTTTCTTTGA 130 

TTTTATTGTT TCTATTAATG GTTCAAGATT AAATAAAGAC AATGACACTC TTAAGGATCT 240 

20 

GCTGAAASCA AACGTTGAAA AGCCTGTAAA GATGCTTATC TA.TAGCAGCA AAACATTGGA ■ 300 

ACTGCGAGAG ACCTCAGTCA CACCAAGTAA CCTGTGGGGC GGCCAGGGCT . TATTGGGAGT 350 

25 GAGCATTCGT TTCTGCAGCT TTGATGGGGC AAATGAAAAT GTTTGGCATG TGCTGGAGGT 420 

GGAATCAAAT TCTCCTGCAG C AC TGGCAGG tCTTAGACCA CACAGTGATT ATATAATTGG 430 



AGCA.GATACA GTCATGAATG AGTCTGAAGA TCTATTCAGC CTTATCGAAA CACATGAAGC 540 

30 

AAAACCATTG AAACTGTATG TGTACAACAC AGACACTGAT AACTGTCGAG AAGTGATTAT 600 

TACACCAAAT TCTGCATGGG GTGGAGAAGG CAGCCTAGGA TGTGGCATTG GATATGGTTA 660 

35 TTTGCATCGA ATACCTACAC GCCCATTTGA GGAAGGAAA.G AAAATTTCTC TTCCAGGACA 720 



AATGGCTGGT ACACCTATTA CACGTCTTAA AGATGGGTTT ACAGAGGTCC AGCTGTCCTC 730 

AGTTAATCCC CCGTCTTTGT CACCACCAGG AACTACAGGA ATTGAACAGA GTCTGACTGG . 840 

40 

ACTTTCTATT AGCTCAACTC CACCAGCTGT CAGTAGTGTT CTCAGTACAG GTGTACCAAC 900 

AGTACCGTTA TTGCCACCAC AAGTAAACCA GTC CCTCACT TCTGTGCCAC CAATGAATCC • 960 

' 45 AGCTACTACA TTACCAGGTC TGATGCCTTT ACCAGCAGGA CTGCCCAACC TCCCCAACCT IQ2Q 



CAACCTCAAC CTCCCAGCAC CACACATCAT GCCAGGGGTT GGCTTACCAG AACTTGTAAA 1080 

CCCAGGTCTG CCACCTCTTC CTTCCATGCC TCCCCGAAAC TTACCTGGCA TTGCACCTCT 1140 

50 

CCCCCTGCCA TCCGAGTTCC TCCCGTCATT CCCCTTGGTT CCAGAGAGCT CTTCTGCAGC 1200 

AAGCTCAGGA GAGCTGCTGT CTTCCCTCCC GCCCAOCAGC AACGCACCCT CTGACCCTGC 1260 



55 • CACAACTACT GCAAAGGCAG ACGCTGCCTC CTCACTCACT GTGGATGTGA CGCCCCCCAC 1320 
TGCCAAGGCC CCCACCACCG TTGAGGACAG AGTCGGCGAC TCCACCCCAG TCAGCGAGAA 1330 



GCCTGTTTCT GCGGCTGTGG ATGCCAATGC TTCTGAGTCA CCTTAACTTT GAACCATTCT I44p 

60 
TTGGAATTGG CGTGGTATAT TTAACCACGG GAGCGTGTCT GGAAACGCAA ACTATCATTA. 1500

ATTTCATACT AGTTTGTACC GTATCTGTAG GCATCCTGTA AATAATTCCA AGGCX3AAAAC 1560 

5 TAAACGAGGA CGTGGGTTGT ATCCTGCCAG GTTGAGTGGG GCTCACACGC TAGGGTGAGA 1620 

TGTCAGAAAG CGCTTGTATT TTAAACAACC AAAAAGAATT GTAAGGGTGG CTTGCTGCCA IS 30 
GGCTTGCACT GCCGTTCCTG GGGGTC-TGCA TCTTCGGGAA AGGTGGTGGC GGGGCGTCCA ' 1740 

10 

CTAGGTTTCC TGTCCCCTGC TGCTCCTTCC GTAAGAAAAT GAAATATTCT ATGCCTAATA 1300 

CTCACACGCA ACATTTCTTG TACTTTGTAA GTCGTTTGCG AGAATGCAGA CCAOCTCACT 1360 

15 AAACTGTAAA CGGTAAAGAG ATTTTTACTT TTGGTCTCCG TGAGTCGCAT CTCTACTAAG 1220 
GTTTACACAG GAATTCCACC TGAAGACTTG TGTTAAAGTT CTACAGC-GCG CACTGTTAAC . 1380 

TGAACGTCTT TTTCTTCAGC CTATACGCGG ATCCTTGTTT TGAGCTCTGA GAATCACTCA 2040 

20. 

GACAACATTT TGTAACTGCT GCTGTTGCTT TCTACATACA CCTTATAAAG TGACATTTCA 2100 

AAAGAAATAA GGTGCCACAG TTTTAAACCA GAAGGTGGCA CTCTGTGGCT CCTTGTAGTA 2160 

25 TTATAGCTAT ACTGGGAAAG CATAGATACA GCAATAAAGT ACAGTAATTT TACTTTTTTT 2220 

CTTGTGTTA.C ATCTAAATTA CAACCCTTAA TTGCCACGTG TGCACTTACT ACTCTCCAGT 2230 

ATGTCTTATT ACTCTCCAGT ATGTCACGCA TCTTTAACTT TTCACGTCCT ATOTTTGCTT 2340 

30 

TCTCCCATTT TTAAGAGATG GTAAGTTAAC TGGAATTGAT TTACTGAATG AAA.TTAAA.TG 2400 

CAGATATCCC TGTTTTTGAA ATAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2460 

AAAAAAAAAA AAAAAAAAAA AAA 2483

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 536 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:



50 GAGAAATGGA GCTTTGTTAG ATAAAAATTT TTTCAACGCA AACAGTCATT TTCCAGTGAA SO 

AGGAGAGCGT ATCCGCCGTA GGATGGACTT AGATCGTGTA AAAGCTGAGG CCACCGAGGA 120 

TATAACCTCC GGGGTCCTTT GCCTCCTTTT CCTTAGACTC CCTCCAAACT CGTGTATCTT 130 

55 

TCCTTCAGCA GTACTGGGCT CCACGCGAAC CTAGTCCTTT GTCTTTACCC TATTACCTTT 240 

CATAACATCC TAGTTGAAAA GTARTTATTC AACCGCGTTT GAAAATGAGA A.CAGGTTCAC 300 

60 AGARGCTAGG TTACTTGCGA AGGTCGTTCA ATTAGTAACG AGTAACGCCA GGACTGCCAG 360 
TTTCTTGCTT CCGAATTCTC ATGGTAGCTT TCACCARGCT CCCCGTGlftA TGCTAACGTC
iLACTACTGAA CTAGATTAGC AAAAAGGTCT TTTAACAGAA TTCCTGGTTT TCAGAGAGAG 
TTTCTTTCAT GAAGCGCCCC ATTTCTACAG AGGAAAATAA ACTCCAAGCA GCCAGT 

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 865 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
CCACGCGTCC GGCCTTTCTT GGCCAGAGGC GCCGGTTGGA CTCACGGGCG GGGCATGATG 
GGTAACAGGA CCGGTGGGGT CCCCAGGAAG TCCTAGAGGG C-GTCGGGGTT TGGGTGGACA 
AGCTTTCCTC GTCCTCTCCC GAGAGAGCTG ACGTGTCCTG GGTTCCACCG GGAGCGGGCA 
TTTCCACCGG ACGGGAGGGT TCGGGGTGTC CGGGGC1GGG GAATACGTAG GGGTTGCCGC 
GCGGTGTGGG GAGTTGGGGC GTGTGGCTGC AGTCCCGGGA GTTCTTGGAG GGGGTCGGCC 
CACCGAGCTT CCGGACCGGC TGATCTGCCC GTAGCTTGCC GGANGGARGG ■ CGGAGCTGAC 




TTCTCCCATC CCCTCCAGTG GTGGGTACGG GCACCTCGCT GGCGCTCTCC 
CCCTGCTGCT CTTTGCTGGG ATGCAGATGT ACAGCCGTCA GCTGGCCTCC 



ACCGAGTGGC TCACCATCCA GGGCGGCCTG CTTGGTTCGG GTCTCTTCGT GTTCTCGCTC 
ACTGCCTTCA ATAATCTGGA GAATCTTGTC TTTGGCAAAG GATTCCAAGC AAAGATCTTC 
CCTGAGATTC TCCTGTGCCT CCTGTTGGCT CTCTTTGCAT CTGGCCTCAT CCACCGAGTC 
TGTGTCACCA CCTGCTTCAT CTTCTCCATG GTTGGTCTGT ACTACATCAA CAAGATCTCC 
TCCACCCTGT ACCAGGCAGC AGCTCCAGTC CTCACACCAG CCAAGGTCAC AGGCAAGAGC 
AAGAAGAGAA ACTGACCCTG AATGTTCAAT AAAGTTGATT CTTTGTAAAA AAAAAAAAAA 
AAAAAAAAAA AAAAAAAAAA AAAAA ■ 

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 92 2 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
TCATCATATA CAAAGTTTTT CGTCACACTG CAGGGTTGAA ACCAGAAGTT AGTTGCTTTG 
AjGAA.CATAAG GTCTTGTGCA AGAGGAGCCC TCGCTCTTCT GTTCCTTCTC GGCACCACCT 
GGATCTTTGG GGTTCTCCAT GTTGTGCACG CATCAGTGGT TACAGCTTAC CTCTTCACAG 
TCAGCAATGC TTTCCAGGGG ATGTTCATTT TTTTATTCCT GTGTGTTTTA TCTAGAAAGA 
TTCAAGAAGA ATATTA.CAGA TTGTTCAAAA ATGTCCCCTG TTGTTTTGGA TGTTTAA.GGT 
AAACATAGAG AATGGTGGAT AATTACAACT GCACAAAAAT AAAAATTCCA AGCTGTGGAT 
GACCAATGTA TAAAAATGAC TCATCAAATT ATCCAATTAT TAACTACTAG ACAAAAAGTA 
T T TTA AATCA GTTTTTGTGT TTATGCTATA GG AACTGTAG ATAATAAGGT AAAATTATGT 
ATCATATAGA TATACTATGT TTTTCTATGT GAAATAGTTG TGTCAAAAAT AGTATTGCAG 
ATA.TTTGGAA AGTAATTGGT TTCTCAGGAG TGATATCACT GCACCCAAGG AAAGATTTTC 
TTTCTAACAC GAGAAGTATA TGAATGTCCT GAAGGAAACC ACTGGCTTGA TATTTCTGTG 
ACTCGTGTTG CCTTTGAAAC TAGTCCCCTA CCACCTCGGT AATGAGCTCC ATTACAGAAA 
GTGGAACATA AGAGAATGAA GGGGCAGAAT ATCAAACAGT GAAAAGGGAA TGATAAGATG 
TATTTTGAAT GAACTGTTTT TTCTGTAGAC TAGCTGAGAA ATTGTTGACA TAAAATAAAG 
AATTGAAGAA ACACATTTTA CCATTTAAAA AAAAAAAAAA ACTNGAGGGG GGCCGGGTAC 
CCAAATCGCC GCATAGTGAT CGTAAACAAT CT 

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 996 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
CGCCTGGCAC CATGAGGACG CCTGGGCCTC TGCCTGTGCT GCTGCTGCTC CTGGCGGGAG 
CCCCCGCCGC GCGGCCCACT CCCCCGACCT GCTACTCCCG CATGCGGGCC CTGAGCCAGG 
AGATCACCCG CGACTTCAAC CTCCTGCAGG TCTCGGAGCC CTCGGAGCCA TGTGTGAGAT 
ACCTGCCCAG GCTGTACCTG GACATACACA ATTACTGTGT GCTGGACAAG CTGCGGGACT 
TTGTGGCCTC GCCCCCGTGT IGGAAAGTGG CCCAGGTAGA TTCCTTGAAG GACAAAGCAC 
GGAAGCTGTA CACCATCATG AACTCGTTCT GCAGG AGAGA TTTGGTATTC CTGTTGGATG 
ACTGCAATGC CTTGGAATAC CCAATCCCAG TGAGTACGGT CCTGCCAGAT CGTCAGCGCT
AAGGGAACTG AGAGCAGAGA AAGAACCCAA GAGAACTAAA GTTATGTCAG CTACCCAGAC 
TTAATGGGCC AGAGCCATGA COCTCACAGG TCTTGTGTTA GTTGTATCTG AAACTGTTAT 
GTATCTCTCT ACCTTCTGGA AAACAGGGGT GGTATTCCTA CCCNGGAACC TGGTTTGAGG 
ATAGAGTTAG CAACCATGCT TCTCATTCCC TTGACTCATG TCTTGCCAGG ATGGTTAGAT 
ACACAGCATG TTGATTTGGT CACCTAAAAA GAAGAAAAGG ACTAACAAGC TTCACTTTTA 
TGAACAACTA TTTTGAGAAC ATGCACAATA GTATGTTTTT ATTACTGGTT TAA.TGGAGTA 
ATGGTAGTTT TATTCTTTCT TGATAGAAAC CTGCTTACAT TTAACCAAGC TTCTATTATG 
CCTTTTTCTA ACACAGACTT TCTTCACTGT CTTTCATTTA AAAAGAAATT AATGCTGTTA 
AGATATATAT TTTAYGTAGT GCTGACAGGA CGCACTCTTT CATTGAAAGG TGATGAAAAT 
CAAATAAAGA ATCTCTTCAC ATGAHAAAAA AAAAAA 

(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 785 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

GGCACG^S^GG GCTTTGCGTA ' CACAATAGCT GCTAGGAGTA CCCAAAGCCT GARTACARCC 

TGCTGGTGTC ATGGCCACGT GTGAGCAGGC CAGCGTCAMA CGGCTCGCTG TGACCCGTCC 

CGRAGACTGA AATGGGCCTG GGTCTTCTCC TKGTCCTGTG ATWAAAGTCC TCTCTTGAAA 

GTGGAGAGGA AAGGCACACA GAGGTGCGCG CTCAOAGAA TTCCTCCCGG TGAGTGGGTA 

ATCAATGTTA CTGCTGTTTC CTTTGCAGGA AAGACCAGAG CAAGATTCTT TCATTCGTCT 

CCTCCTAGCC TGGGGGACCA GGCTCGAACT GACCCTGGAC ATCAAAGGAG GGATTATGTG 

GCTGCTAAAG CCATCGGGCC ACAGCCCTGT TCACRTCTTG GTGCTTCTCT TTCCCAGAGG 

CTGGTCCCAG CCAGGCACAC ACAAAAGGCA GATTCTCGTA AACSCAGCCT CCCTCCCTGG 

AGGCTGGCTC CTGCCCTGGA TCTGGAGTGG AGCTGCTCTG AGATTTTGAG TTCTTCTGCA 

GAGATGATTA AATATATCCA AGAGACATTG GAAAACCTGC. TGAAGATTTT ACATTGGTCT 

GCTCAGCAGA TGGCTGGATG CCGATATTTC TATAATTCCA GAAAGTCACA CAGCTCCTCT 

GTATGAGACC AGTGGGCGCC ATTTAAAAGA ACAGGATGAG AATCTAAGAT ATATTATT-AA 

TAAATGTAAT GGATTTTTTT TTTGTAA-AA AAAAAAAAAA AAAAAAAAAA AAAAA-AAAA 
AAAAA



(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1069 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
TCCTCACCAT TCCCCTAGGN CAGCTCOCTG CAGGTCCCAC ACTTCTCCCA GGTCCCTAAA 
CTTGGGTCGG TCCTTTCCCT GGAGTAGCTG GNTCCTCCAG TCGAGGTCCC TGTTCAGTCG 
GTTCTTAGGC TCCTGCACAT GAAGGTGTGT GCCTGTGGTG TGTGGGCTGC TCTAGGAGCA 
GATACAGGCT GGTATAGAGG ATGCAGAAAG GTAGGGCAGT ATGTTTAAGT CCAGACTTGG 
CACATGGCTA GGGAXACTGC TCACTAGCTG TGGAGGTCCT CAGGAGTGGA GAGAATGAGT 
AGGAGGGCAG AAGCTTCCAT TTTTGTCCTT CCTAAGACCC TGTTATTTGT GTTATTTCCT 
GCCTT T CCGA GTCCTGCAGT GGGCTGCCCT GTACCCTGAA CCTCATGAGC CTCTAAGGGA 
AAGGAGGAAC AATTAGGACG TGGCAATGAG ACCTGGCAGG GCAGAHTACA AGCCCAGCAC 
CAGTGTCCCA GCCTTACTGG GTCCTTACCC TGGGCCAAAC AGGGAGGGCT GATACCTCCT 
TGCTCTTC'GT AGATGCCCAC CTCCTACAAT CTCAGCCCAC AAGTCCTCTC' CACCCTAGGG 
GGCTTGCTGC ATGGCAATAA CTCATAATCT GATTTGGAGG TTTGCCCTTT ACAGGGGCAG 
ATTT T CTGCT CAGTTCAACA ATGAAATGAA GAGGAACTCC CTCTTTCTAC AGCTCACTTC 
TATCAGAGGC CCAGGTGCCT CAGAGCCACA TTGAGTTGCT TTTTCTGGGA TGAGGAAGTA 
GGGTTAAACT CCCCAGTTTC CTGAGGGAGG CTCCTGACAG GTGCCCTTTG- TCAGAC CCT A 
CCACAGCCTG GATAGGCAGC CACATTGGTC CTCGCCCTTG CTCGGMACTC CGTGGTGGTC 
CTGCCCTTCT CCCTGCATGC CTGTGGGTCT GCTCTGGTGT GTGAAGGTCG GTGGGTTAAC 
TGTGTGCCTA CTGAAC CTGG CAAATAAACA TCACCCTGCA AAGCCAAAAA AAAAAAAAAA 
AAAAAAAAAA AAAAAAAAAA. AAAAAAAAAA AAAAAAAAAA AAAAAAAAA 

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 831 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
GGACATTAGA TCACTGTGGA CCTAAAACAA AC-AACAACT ATAAGGAAAA TGGCATTAGA 
AATGGTCTGG GGATCAGTTT ATCACT GCAG TTGTTACATC ACCCCATGGT CTAAAATACA 
GAGCTTTAGT CTGTCTCTGT TTCAGTTCAT TTTACAGGAG GTGAAC AT CA CACTTCCAGA 
AAACTCTGTC TGGTATGAAA GGTATAAATT TGATATTCCT GTCTTTCACT TGAATGGCCA 
GTTTCTGATG ATGCATCGAG TAAACACCTC AAAACTTGAA AAACAGCTCC TGAAACTTGA 
GCAGCAAAGT ACTGGAP.GCT GAOTGATGCC CTCATGATTT TCCACCCTCT CTTCCCATAA 
AGCATCTTCC TAAGGAAATG AMCATGGCCT GATACTCATT TTGTCACTTG TACAGAGCCC
TAAGGATGTT CTGAATTCAG TGGTGCCAAA TAAATGTTGA CATTCCCCTT TTGGTTGATG
GAAGTATCAG TGTGGGAACT GTTTGCTTAA TGGCATTTTA TAAAATAAKA AKAXCATATT 
AGCAGGGAGG GAGATGATGG AGGGAGGGAG AAGTCCATTT GTCTTATTTA TCCTTTTTGT
ATTAATAGAG AAGCACTTCA CAGTGACTGG CAATGGCATT TATAGGAAGA AGGTTCTGCA 
TTCCTGCTGC TCCCGGAGGG CTTAACTTTT TAATGAAAGA ATAAATGGTC TTCCACTCAG
TAGATAAAGT GAAATGTGAA TTGTTAATAA CTGTGCACGG TCAATAAAGG GATGTTTTAA 
GGAATACAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAACTCG A 

(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 590 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
TATATATAGA CNGTTAATAG TCGTGANTGN TGTGNACGAA CATTAACGGA AGTAGCATGT 
AGCCAGTCGA ATAACOTATA AGGACAAAGT GGAGTCCACG CGTGCGGCCG TCTAGACTAG 
TGGATCCCCC GGCTGCAGGA TTCGGCACGA GCTGCCAGGT GAGGAGCAGA GAGACTGTTC 
CCTTGGGTGG AGAGGTGTGG GCATGAGAGC CACCCATTGC CAAGCAGCAA GAATGTTCGT 
GCTTTTTTCC CTTCCAAAAT ATGCAGGGCT CAGGCTCCCA ATTCCGGGCC TGTCTGCTTT
GCTTGTGTTT CTCCTGTCCC TGTTCTCCCG GAGGGCCCAG GTGGAACTCA CGACAGGGAG 
GGAGACGCTT CCCAAAAACC TGCAGGGCTA TTTCCCAGAA TTTGGTTTTC AAGTACAAAA
CTTTTTGTCC TGTAAGATAT AIGCAGCCTC ACAGAAGCAG CCTCTGCCTC CACTTTACCA 480

GCTACGTTTT TATCTTAAGC ACATGGGGCT CCCTTAGAAC TTACTCCACT GATTTAAAAA 540 

AAAAAAAAAA AAACTCGAGG GGGGGCCCZG TACCCATTCG CCCTAAAAGT 590 

(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1274 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:



GAGCCACCAC ACCTGGCCTG GAAGGAACCT CTTAAAATCA GTTTACGTCT TGTATTTTGT 60

TCTGTGATGG AGGACACTGG AGAGAGTTGC TATTCCAGTC AATCATGTCG AGTCACTGGA 120

CTCTGAAAAT CCTATTGGTT CCTTTATTTT ATTTGAGTTT AGAGTTCCCT TCTGGGTTTG 180

TATTATGTCT GGCAAATGAC CTGGGTTATC ACTTTTCCTC CAGGGTTAGA TCATAGATCT 240 

TGGAAACTCC TTAGAGAGCA TTTTGCTCCT ACCAAGGATC AGATACTGGA GCCCCACATA 300 

ATAGATTTCA TTTCACTCTA GCCTACATAG AGCTTTCTGT TGCTGTCTCT TGCCATGCAC 360 

TTGTGCGGTG ATTACACACT TGACAGTACC AGGAGACAAA TGACTTACAG ATCCCGCGAC 420 

ATGCCTCTTC CCCTTGGCAA GCTCAGTTGC CCTGATAGTA GCATGTTTCT GTTTCTGATG 480

TACCTTTTTT CTCTTCTTGT TTGCATCAGC CAATTCCCAG AATTTCCCCA GGCAATTTGT 540 

AGAGGACCTT TTTGGGGTCC TATATGAGCC ATGTCCTCAA AGCTTTTAAA CCTCGTTGGT 600

CTCCTACAAT ATTCAGTACA TGACCACTGT CATCCTAGAA GGCTTCTGAA AAGAGGGGCA 660 

AGAGCCACTC TGCGCCACAA AGGTTGGGCT CCATCTTCTC TCCGAGGTTG TGAAAGTTTT 720 

CAAATTGTAC TAATAGGSTG GGGCCCTGAC TTGGCTGTGG GCTTTGGGAG GGGTAAGCTG 780

CTTTCTAGAT CTCTCCCAGT GAGGCATGGA GGTGTTTCTG AATTTTGTCT ACCTCACAGG 840 

GATGTTGTGA GGCTTGAAAA GGTCAAAAAA TGATGGCCCC TTGAGCTCTT TGTAAGAAAG 900 

GTAGATGPAA TATCGGATGT AATCTGAAAA AAAGATAAAA TGTGACTTCC CCTGCTCTGT 960 
GCAGCAGTCG GGCTGGATGC TCTGTGGCCT TTCTTGGGTC CTCATGCCAC CCCACAGCTC 1020

CCAGGAACCT TGAAGCCAAT CTGGGGGACT TTCAGATGTT TGACAAAGAG GTACCAGGCA 1080 

AACTTCCTGC TACACATGCC CTGAATGAAT TGCTAAATTT CAAAGGAAAT GGACCCTGCT 1140 

TTTAAGGATG TACAAAAGTA TGTCTGCATC GATGTCTGTA CTGTAAATTT CTAATTTATC 1200 

ACTGTACAAA GAAAACCCCT TCGTATTTAA TTTTGTATTA AAGGAAAATA AAGTTTTGTT 1260

(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1133 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

AGGATTTTTC CTTGTTCAAC CAAAATCTGA GCATTCTTTC TATGTTGAAA ACACTGAAAA 60 

ACTAATTTWA GTTAATGAAC TAGAAAGAAT ATTGATTTTW AAGAAACAGA AAAATACTAC 120 

TTATTTTCCT TCTCAAATAA CGTTTCTTTC AAAAACTTCT GGCTGAAGTA TAACATGCTG 180

GTAGTTAACA TAAATCTTGT CTTTCTCTTG TTCTTTATCT TTCTTTGTTA TTTAGATGCT 240 

TGTATAAATG TCTTTTGTTT TTATTAAGTG CCTAATTGAC AGAGCTTAAT TTGAAGAAGT 300

GCCCTAATTT ATTGACCACT TAAGAATTGC CTTTATTGGG GTATTTTATT TGTTCCTGCG 360

TCTTTTTGAT GTTGTTCAGT CTACTCATCC CTGTGAGTAT GTGTGGGGGA CAGCTGATAG 420 

AAGGGAGGAG AGTGTGTCTA TGCTCAGGAT TGCCCTTTAG CCACTCAGCC AGAGATCCAC 480

AGGGAGCAAC AAGGACAGTT TCACATGCTT AGACTTTCTT GGAAGAAACA GTGAGGAGGA 540 

GTAAGTCGTG AGTAGTGTCA AGCTGGATGT AGAATTGTCC TAAGGCAGTT GACCCCACCT 600

TCCAACATGT TTTCACTTTA TTTGCCCCTC CCTACATTTG GGTTAGGTTC CATTTGGATT 660

TGCAGCAATA ATGACTTTAT TTCTCTCTTG GTCAGGATTT GGCACATAAr. AT CCTTTT AT 720 

TATAGAACTA GCTATTTTAG TTACATAGTA ATGTAACTAA TGGAGAGATT TATAGAGAAT 780

TTTGKTTTTG CTGTCATATA TGTCCATTTT GGAGACA3AT ATGATAGAAC TAGAAATTAA 840

GTTGCATTTC TGCAAGTGCC ATTTGAATGA ACTTCAAGTA TCTTCTTAAT TATTAAATTT 900

TCTGATGAAG GCATTGTAAC AAATATATAG TATTATTAA 1 - TCTAATTAAT ATTTGGAAAT 960 
ATTAATAAAT AGGTATTTTA TTTACTGTAA AAAGTCAAAC TTCATTATGT AGATAAATCT 1020

TATTCTTTTC ATTCTTTCCC CTGTTTACAT CCTTTTTACA AAGCTTAGTC ACCAATTAAA 1080

GCTTTCCTAT CAAAAA?AAA ^AAAAAAAA ACTCGAGACT AGTTCTCTCT CCT 1133 

(2) INFORMATION FOR SEQ ID NO: 79: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 661 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

GAATTCGGCA CGACGGGAAA AGGATGCTGA ACGAGAGCAG AAAGCCTCTT TCCTTTGCTT 60

CACGCCTTTC CAGTCTTTAT TTTAAACTCG GGTTCCCTTT CTGTGGTCGC AGGAACCTTT 120

ACTCCACCTG CACTGGTGCT CCTGGGGGCT CCCCAGGCCT CCCTCTGCCT TTCTACCCAG 180

TGGGTGACGG GATGCCTGTC TTGCCTGGAC GCACCACTGC TCTCCTGTCC CTCACCTTGG 240

CTTTTGCTGT GCCCTGCTCT GGGGTTGAAG CTGGCCCATG TGTCCCCCGG AGTCATGGCT 300

GCTCCTCCTG GGAGGCCTCT GTGTGCGTCA CGTCTTCCAC ZSZCTGGGGGC AGCTGGCGAG 360

CCCGTCCTCT GTTCCCCTCG GCTGCTTGGC ACAGAGYTGC AGCCTGGGAY TCTCCGTGGA 420 

CCCAGACTGG GGATTTTGCC AGGGGGGCGA TGGGAGGAGC AGGTGCTTTG CCTGGGGGCT 480

GTGTCTGCAT TTCTGGACGC CCCAGAGCAC AGAAGTTGCC GGCACTTTGA GGTCTTCCTC 540 

GGCATGTGCC AGATTACATG AGTGACGGCT GGGAATATGT TTTCTTTTTT GTAATGGAGG 600 

CGTGTTTCAC ATATAGTAAA GCTCACCAAA AAGTAAAAAA AAAAAAAAAA AAAAAACTCG 660 

A 661



(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1378 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:



ATTGGGTACC GGGCCCCCCC TCGAAGTTTT TTTTTTTTTT TTTTAATGAA AGCTCTCAAA 60

TAAGCGATTT TATTCCTATC CATGATTGCA GACATTTACA AAACCATAAC ATCTGAGTTC 120 

ACCTTAAAAA ATAACTTATA TAAAGCAGTG ATATACACAG CACAAAATAG TTCAGGGAGG 180

GGGCAGGAGC AACTTGTAAT AATTAAAATG TAAACGTGAA AAAAAGGATG GAATAAAAGT 240 

CCCTACTTAT TTCTACTTAA GATGTCATGT GATAATATTT TACAATGTCC TGTGGGTCAA 300

TGTATGTATG TGTATATGTC TGTATAACAT ACACATATAC AGTACATTCT CTTTCCCACA 360

CATATACATA CACACATAAT TATTTGCAGT TCAGTTTAGG GCAATTCTAA TATGCCACTC 420 

CGTACAGTTG TTTGAATCAC ATTTGGACCC GCTTTCTTCA CAAAAGAGGG GAGAGAGCAG 480
GAAATAAAAA GGTTGGTTTG GTGTGACTGA GATTCCTTTG TTTAACTGTA CACTGTGATG 540

AATAATTTTC TTCCGTAGTA GTTCTGTGAA GGGCTGACTC ACTGTGGTTT TCATGAGGAG 600

ACTTGGTAAT GGATCACACG CTCATTGTCA TGCTAGGGGA GTAACTCTCA CTCTGAAAAG 660

GATTTAAGAA ATTTCCCCGC ATTTCGCCAT CATCCCTTGG AGTGCCCGGT TGATTACTCA 720

GGCTCATATT ATTGGGAGAA TTCTTGGAAA TACTGTCCAT ATCTCCTGAG CCTAAAGAGC 780

CATTCATGTG ATGTGACTCC ATTCCTGCTA ATCGAGCCAT GGGACGATCT GACCCAGGRC 840

CCATTGGAAA ATTAGGTCTG TTAGGTCCAG GAGGTACTGC ATTCATTAAA GTATAGATGT 900

TATCACCAGA GTTGGTTGAA TGTGCTGGAC TAGGCATGAT GGGTGTTCCT GGTGGCCGTC 960

CACCTCCTGG AGGACCTACA TAATTCCCAG GAGATGCTGA GGACTATGGT ATTGAATTGG 1020

CATTTGTTGG GTTTGGCCAA GGTCTACCAC CACCTGGACG CATGTTCATT CCACGCATTC 1080

CAGGGGGAGG TAAAJGCATTC AGTGGGGGTC TCATTGCACG TCCATAGTTC TGTGGTCCTA 1140

AGGGCACCAT TCCTCTTGGA GGAGTCATTG TCTGCATTGG CCGACCCATA TTTGGATGTC 1200 

CTTGTTGTCG AGTTGGATCC ATTGCACTGG GGAGTAATGG CTGACTTCCT GGGACACCTC 1260

CAAGTGCCTG ATTAGGTATG CTCAATGGGG ' X3CCTTGGACC TCCAGGGTAC CGAGGTGACA 1320 

TAAAAGGGTA ATCATGGAAG GGTTTTGCTT CACTTGAGTG TTGACATGTT TCACGTCT 1378

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1440 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

ACTTTGTCCA AATGTGTCTG TCACATGTAG TCAGCTGNAG NAATTTAAAA TGAATTGCCA 60

AGTGAAGAGT CTGTGGATTA ATTGGCCGTT AATTAACAGG CTTTATCAAT GTGTCCTCAA 120 

GGGAGAGGCC CAACCCTAAT TAAGGAGCTA AACTTCCTGA GTGAGGGGCT GTGAGGATGG 180

AGGTGGAGGA GGCATCTGGG GCGGGTGGTG GCCGGGGGAG CAGATGGCCC CTCCCTGGCT 240

GAGCTGCCCG CACCGCCAGT TCCCTCATTT CCACTCAGGA AGGCAGAGAA GGCAGAGTGA 300 

TCTCCTCAAG GAAGAGCTTC CCCAGCCTTC GGGAGCAGCT GGOC-GGCGT CCGGGAAT^A 360 

GCCCTACACG CCGCCGCCTG CCTCCAACTG ACTAACCCTG CGGCTCTTGT CTTTCAGATT 420

CAACGCGTTC AACAGAAGCC ATCCCCAGCC CAGCTTAAAT TATAAAGATA GACAATAACT 480

CTGTTCCAAT CTGCGTGGTC CTTCTTTAGT AAATACTGTA CAGATTTTAC CATGGAGAAC 540
TTTTTTTTTA GTTTTTACCT TTTCTTAATT ACCCTTATTC CGAATGGACG ^.OjCTTTCT 600

ACCACTGCTG ACCATTGTAA AATACCGTGT ATATAAATCC CATTGAAATA ATGCCCTGGA 660

ATAGAACATC TCAAATGCTG CTTAATTACA GACTCAGGTC GATTACTTGT ATTTCATGTA 720

ATGTTCCTCC AAGTTAGACA TCTGGTGCAA GACCAACCGG GAGACCATGG AATTGTCAAA 780

AGTACAAACT GACAGTGTGT ATATTTAATT TAAAGACTTA TTTAAAAACT CACAAGCTGT 840

CACCTAGACT TTGGAGAGCA GTCTGTTTTC TGTAATGTCT GATACTAGAA ACTAATTTGC 900

TTATTTTAGT TGTATTCAAG ATTTGAAGAT GTATTTTATA GACAAGTTCT GTTTTTGAAC 960 

TTTGTGGAAC TGTTCCAATC AATCAATTTG CCAGTTATGA TGAGTATTTA CATTATGAAT 1020 

GTATAACCGA GACATGATTT GTAAAGCCGA CAGTATGTTT CTATTACACA ACACTTTT I'G 1080 

ATACAGCGTC TCTTGTCTTC ACTGATACTG GAGTCTCCGT TGTCTGCMWG GTCCCTTCGA 1140 

GTTTCTAGTT ACAGACACAA TCATACTGTG ATTTTATTTT TAATATGGAT ATGCTATCAA 1200 

AGTGTGATAC ACTTATAATT CACTGGTGGT GCATCAGGAG ATGGAGTGGG GAAAACTGTA 1260 

TTTAATACAG TTTGTATCTG AATAATCTGT ATGGTTTATA CAGTTTGTGT TGTTCAGAGA 1320 
TGTTTAAAGT TTGATCTTTG TTTTTCTAAA GATTAAAAAA GCACTTGCCC CACTGTAAAT 1380

ATACAGCATG TAAAATTTCT P.TAGTATATA AATGGCAGCA AATCAEAAAA AAAAAAAAAN 1440 



(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1381 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:



CCCGGGCTGC AGGAATTCGX lACGAGGCCA GCAGTTGCTC CCAGTTCAGG AGGTGCTCCT 60

GTACCCTGGC CACAGCCGAA TCCTGCCACT GCTGACATCT GGGGAGACTT TACCAAATCT 120 
ACAGGATCAA CTTCCAGCCA GACCCAGCCA GGCACAGGCT GGGTCCAGTT CTGACCTGAG 180

CACGGTTTTT CCTCATGTGA CTTCTGGGAA GGCGCTCCCT CATCTGGGCC AAAGGAAGGA 240 

GGACGAAGCC CTCCTCAGCT GGCCTGTGTT TGGGGCATGA ATCTCTCCTC TCCTCCTTGT 300 

CTGGCTCTGT TGACAAACCG GGCATGTTTG GCAGTAAATT GGCACGGTGT CACACTGTTT 360

CCTGGGATTC AAGTATGCAA CCAGAACACA GG AGAAGAAA i^GCTCCAGGA. TCCCTGTCCC 420 

CATCTGTCCT CTTGATGTGA GAGAGACTCT GAGACTTCTT CCATCGCAAT GACCTGTATT 480
AAACACAAGC CCCCCAAGCA AAAGAAGAGG TTGAGTTTGC TGGCAGGATT CAGATCAGCC
CTTCCCAGGG TCTGCAGGTG TCACATGATC ACAGTTCAGC GGGAGGCTTT CCGTACCCAC
ACTGCCTGTA GCACTTCAGT CCATCTGCCC TCCAGAGGAG GGTTTCTTCC TGATTTTTAG 
GAGGTTTAGA GGCTGCAGCT TGAGGTACAA TCAGGAGGGA AATTGGAAGG ATTAGCAGCT
TTTAAAAATG TTTAAATATT TTGCTTTGCT AATGTGGTGA TCCGCACTAA GTCATCTTTG
CAAAAGGAAC TGCTCCCTCG GCGTGCCCCA GGTGGGGCGT CTGAAGGGAT TCCTCACTGT 
GGGCAGCTGG CCTGAGCTTC AGGCAGCAGT GTTCATCTCT GGCCAGTTGT CTGGTTTCCA 
TGTATTCTAG GGCAGGTAGG CAAO.CAGAG CCAAGGCGGG TGCTGGAAGC CAGACGGAAC 
AGTGTTGGGG CAGGAAGGTG GATGCTGTTG TCATGGAGGT GTGGGAGTTG GCACTCTGTC 
TGCTGGTGGG CCTCTCGGCT GACATGTTCA CAGTGCAGGT CCTGGCAGAC TTGGGTTTTC 
TCTTTGGTGG TTTCTAAAGT GCCTTATCTG CAAACAACTT CTTTTCTGCT TCAGGAAGTG 
TGAATGGCTA GAAGAAGGAG CTCAGTAAAC TAGAAGTCCA GGGTTGCTTG GTTTACTOGT 
TTATAkGAAA TCTGAAinGCA CCTCTGACAT TCCTTTTATT AACTCACCTC TCAGTTGAAA 
GATTTCTTCT 'TTGAAAGGTG AAGACCGTGAZaCTGAAAAAA GTGTTGGCCT TTTTGCGGGA 
CCAGATTTTT AAGATAAAAT AAATATTTTT ACTTCTGTCA AAAAAAAAAA AAAAAAATNT
C 

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1706 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

ACTGCACCAC TGCCCAGGTC TCCCGGCTGG ATGAAGACGT GGTCCATGAG GAAGCTGGCT 
AGCTCAGACT GGAGAGTAGC TTCAGGAAAA AAGACAAGTG GCCTAAGGAA ATCACGGCCC 
CCAACTATCA TCTGAGGGCT AAAGATGAGA AGTAGATCAC TTAATAAGAC AAAAGCCTGT 
AGGGGGAAAA GAAAGGATGT TTAAAAGGAC AGAATGTTTC CCAACGTAGA AATGACACTG
TCAATTTCTC CTTGGAATGG GGGCAGGGAT ACTCGCCTTG TTGCTCCCAC TTGAGTCAGT
ACTCACCTGC TCCTGGATCT CAGTATCCAC ATCTGAGAGG CAACTCTGGC AGAGTTCACA
GAAGGCCACC ATTCTGTCCC TC-AACTCGA CAGCTGCTTC TGTGGGCACA GTGGCTTGAA 
GGGGA-.GP.AT GAAGACACAG ACTCCTCTGT TCCCATTATC CCATCT.-AGA CCCACAGTCA
CCTGGGGAAG CATCTGATTT AGAAATGTGG GTTAGTGTCC AGAGAATGGA AAAA.TAjGACA 540

AGAGTCAAGG CTGGCAGGAT AACCTGTAAC AACAAAGGGT TTGAAAAATG AGGTTTGGGT 600

TAGGAGAGGG AGAGACAGAT AGCCAGAAAC ACACCAGTGA AGAGGAGAGA AAATGAGTAA 660

AGGGAGAGGT AATTCCTTTT CCAGTGGAAA ATGAGTGATA TTCTGGAGAT TCTTCAGAGG 720

CATCTACACG AAGTAGAAAT GTCACCGCTC CCTAATTTAC TCTACGTCTT CTAGAATCCC 780

TCAATATTAT CCTTGGCTTC CAGGAAATCC AAGAAGACCC TGGAAGTAGA GTCCACCTTC 840

TAAGAGAGGA ATGTAAGAGG TGACCCCCAC CCACCTGATC TTCCTCGCTT TGTCCACTCC 900

ACGGACTGAG ACTTGACACA CCTAGTGGCC ACCTAGAACG TAGGTCCTTA AAATYTAGCC 960

CGCCAGCCCC CAACCCATCT CTAGCCTGTC CACTCACCTG GTGAGGAACY TYTCCTGTGT 1020

CCACAGCYTT CTGGAGGAGT TGGCAACATG GGTCATAGAG CTCCCAGCGA GTCAGGTCAT 1080

GAGTGGTTTG GGGGAGAAAG GGGAATGTTA TACTGGAAAA GAACAGAGGG AACCAACTCC 1140 

ACAGACAC OA GTAAAAACGG GATGGGGAAG AGGAGGAAAG CCACTCACTT GTAGAAGGCA 1200

GAGAGGCGTT TCAGAGTGGC TGCGAGATTA TATACCTGAT CCTCATCTAG GAAGGACGAC 1260

TGAGAAGGAA AGAAGATCCA CAATAGCATT TCCCCCAGAA CTCATCAGTC CACATCCCCC 1320

GTCTTGCAGC CCCTCCCACC CTTGTTTGGG GTGTCCCATT GTCCAGCCCC AGCTCCTACC 1380

TGTAACAGGT CTTCAAGCTC CTGCTGGAAF. CGGTGAGTCA. GCAAATCTAC TAGCTGGCTG 1440 

CGGGCAAAGT CCGCCGGGGT GAAGAAAGTG AATTCGGGAT TAGAGAGGAG GTAAGAGCAT 1500

GCGCCCCAGC CTCAAGGACC GCTGGCTCTG CATGGTTCAC CACCAGGTCG TGGAGTTGCT 1560 

GGAGGAACAG CTCCAGGTGC TGAGAAGAAA AGGCAGAAGA TGGTGTGGTG TGGGGATCGG 1620

AGGAGGACAG TCTTCTGGCG GGAAGTGGAA GGGGGTTAAA AGGATTAAAG TTGAAGGATA 1680

AGATGCCTAA PAAAAAAAAA AAAAAA 1706 



(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
GAATTCGGCA CGAGCTTGGT AGCCTTAGAA CTGCATGAGC TGCTTTAGCA CTGGGAAAGA 60 
CGAGCACAGC CTAGCTTGAT TTTGTATGTG GTATCAGATC TAAGGTGGAT GGAATTCAGG 120
ACTTCCTGTC TACTCTTTGA TTTTGTTTTA TTTTTAGAAA TGTTTTATTT TGTTTTATTC
ATTTATTCAT CTTCAGAGAC ATGGTCTGGC TCTGTTGCCC AGGATGGAGT GCATGGTGTG
ATCATAGGCC ACTGCAGTGT TGAGCTCCCG GGCTCAGGCG ATCCTCCTCC CTCAGCTYCG
TTAGTAGCTG GGACTATAGG CAGATGCCCT ACCATGCCTG CGTTTG7CTA. CTTTTTGAAT 
GATGTCYCAA ACTAGAAGGT CTATTAATTT AAAAAATTAA GGATAGCATG CCATAATTAA
AAATAATAAC AGTGGGAAAA GGCACCTTCC AATGATTCAG ACATGAACTT GTGATTTAAA
AAAACGAAAA ATAAATAATA GGAAAAAAAG GGGAAAAAGT TAAATAAAAA TAAAATTAAA
AAAAAAAAAA AAAAACTCGA GGGGGGCCCG GTA 

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 684 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
GTCTTTGGGT GTGTCTACCT CCTTCATCTG CTGCGCCGAC ATAAGCACGG CCCTGCCCCT 
AGGCTCCAGC CGTCCCGCAC CAGCCCCCAG GCACCGAGAG CACGAGCATG GGCACCAAGC
CAGGCCTCCC AGGCTGGTCT YCAGGTCCCT TATGCCACTA TCAACACCAG CTGCYGCCCA 
GCTACTTTGG ACACAGCTCA CCCCCATGGG GGGCCGTCCT GGTGGGCGTC ACTCCCCACC 
CACGCTGCAC ACCGGCCCGA GGGCCCTGCC GCCTGGGCCT CCACACCCAT CCCTGCACGT 
GGCAGCTTTG TCTCTGTTGA CAATGGACTG TACGCTCAGG CAGGGGAGAR GGCTCCTCAC 
ACTGGTCCCG GCCTCACTCT TTTCCCTGAC CGTCGGGGGG CCAGGGCCAT GGAAGGACCC
TTAGGAGTTC GATGAGAGAG ACCATGAGGC CACTGGGCTT TCCCCCTCCC AGGCCTCCTG
GGTGTGATCC CCTTACTTTA ATTCTTGGGC CTCCAATAAG TGTCCCATAG G7GTCTGGCC 
AGGCCGAGCT GCTGCGGATG TGGTCTGTGT GCGTGTGTGG GCACAGGTGT GAGTGTGTGA 
GTGACAGTTA CCCCATTTCA GTCATTTCCT GCTGCAACTA AGTCAGCAAC ACAGTTTCTC
TGAAAAAAAA AAAAAAAAAA AAAC 

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1036 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

TGGAGGCAGA TGCACAGGAG AAAGGTTCCC GTCCGCACCC TCTCAGACCT GAGGCTGAGC 60
TTGCAGTGAG GGCTTCTCCT CGGCCCCTCG CCCGCCCCCA GAGCTGCCAT CCCTGCTGTT 120

AOA.GCCAGA GGAGCCCGGA TGTGAGGCCC CAGATCACCT CCAGGGACTT GGGGTTCCCA 180

TCTGAAATCC TTTATTTTTG TACCATGGGG TGGGCCCCGG GCTGAGAAGG AAGAAGCACC 240

GTCTCCCCGG CCTCCTCTGT GTGCAGCCGT GGGGCTGTGA CTTACTCCTG GCTCCAGGGG 300

CGGGGCGGGG CCCCCTGGGA GCTCTTAAGG CCCAAGGTGG GCGCCAGC-AC CTYTO^GCAJG 360

AGTGGAYTGC TC^TGGCAGA TGTGTGGCAA TGTCTGGCTG WGTCTTTCCG GCAMCTGCG7 420 

YCCGTYTCGC GGGYTGCC-CT GCTGCATGGT GGATGTGCTC CTTCCTGGCC OGGTCAGATT 480

GCCTCCTTGA GCCTTAGTCC AGGGGGTCAC TYCTCCCACC CCACCTACCT CACAGGGTTG 540

TTGTGAGGGT GCACAGAGGA GCAAAGTCCC TGAAGGCCCT C\GGCA.GTAT ATAGGGGCCG 600 

CCCACCTTCA GCTGCCCTGG GATGGGAAGd' ACCCAjGCCCG ACCCCTGGGC ATAACACTGT 660 

GTTTGCAAAT GGAGATTCAG GTATTGGGGA TGCAGGTTGT GGGGAGCTGG CCTGGCAGAG 720

TAGGGGTAGT TGGCTTGGCC TTCTCTTTGG TGATCCCACC CCCAGCCATT TGCATTGCTG 780

GCCCAGCGCC TGGCCTGGGG GGCGGGGAGA GGCAGCAGAA GGGGCTGGGC AGGGGCGGTG 840

GAGGACTCAG GAACTGCCCG GGGAGAGTGG GTATGGCGGC TGAGCCAGGG GCCCTCGTGT 900 

GTTTGACTTC CGGGGATGGG TCCTTGCTTC TCAGCTGTGT CCGACCCCAC OiTGTAATAA 960

AACCCAAAGG AAOjCXTAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAN 1020 

CCCNGGGGGG GMCCCG 1036



(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 908 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

TTAAACAAAT GGAATCATGC AATATGTGAC CTTTTGCGTC TGGCTTATTT TATTTAGCAT 60

AATGTTTTTG AGGTTCATCC AAGCTGTAGC ATGTATCAjGC ACCTCATTTC TTTTTGTGGC 120 

TGAATATTAT TCCATTATAT CGATTTACCA CAATTCATTT ACCTATTCAT CTTTTGTTTC 180
TGCTGTCTGG CTATTGTGAA TAATGCTTCG ATAAACATTC ATATACAAGT TTCTATGTGG
CTTTATGTTT TCATTTCTCT TGGCTATCTA CATGGGAGTA GAATTCTAGG TCATAATATA
ATTTTATGTT TAACTTCTCA AAGAATTGCC AAAAGGTTTT TCATACTGGC TGCATCATTT
ACATTCCCAC CGGCAATGTA CAAGGATTTC TATTTTTCCA TATCCTTGCA CTTACCAACA
CTTCTTTTTK GTWATWftTTT TGTTTTTTCA TTATTGCCAC CCTAGTGGAT GTGAAATGGG 
ATCTTATTGT TTTGATTTGG ATTTCTCTAA TGACAAATGA TATCATACTT TTTTTATGTG 
CTTACGGATC AAAGGTATTT CCTTGGAGAA ATGTCCCTTC AAGTCCTTTG CCATTTCAAA 
ATTTGGTTAT TTGTCTTTTA TTATTCAGTT TTAAGAAATT CTGGCGAGGC GCAGTGGCTC 
' ACCTGTAATC MTAGCACTTT GGGAGGCCAA GGCGGGCAGA TCACTTGAjGK TCAGGACTTC 
GAGACCAGCC TGGGCAACAT GGTGAAACCC CATCTTACTA AAAATACAAA AATTAGCTGG
GCGTGGTGGC AGGTGCATGT AATCMTATCT ACTCAGGAG3 CTGAGGCACG AGAATCGGTT
GAACCCAGGA GGCGGAGCCT GGAGTGAGGC AAGATCACGG CATTGGACTC TAGGCTGGGT 
GACAGAGA


(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 655 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:
TGCACTGGTT CCTTCTCCCC AGCAAATACT GCCTTCTTGT TTTTCTCTGA TGTGGGAGGT 
GACTACAAAA TCCGCCTTGG TATTCTTCAA ATGCATATAT ATTCCTTTCT TGTCAGCTCC 
CTCTGTTCCT AGATTAGAAA ACTGCCTCAT TTTCTGCTCA CTGGATGTGC AGTCCCAGCT 
TGTCTTCCTC TCCTCCCCCC CTGTTGCAGG TGTTCTTTTT TTTTTTCTTC TCTCCCCACT
GGGCAGCAAA AGTTGTTCCA CAGTGGAAAW TTAGGCATCC TCAAGTTTCY TCCCAGCTTC
TGCTGTGTTT TCTTAGAGTA AATTGCCAAT TTCTGTTTTT ACAGGAAATC CTTTTTTAAA 
AATGGAATCA GTGTGGTCCC CATCTACTCT GCAAAAATTG CATTTTTCTC TATTTTCAAA 
TGAGATTTGT TCAAGTTTCA AAACCACGTG AAATAATAAA TGTATAGTAG TTTTCTTTTC
CTTGGGCATT GCTWGATATG TGAAATGGGT TTATGAAAAA TAATAAAATC ATAACGGTAT 
TTGTTTGACT TTCAATTTCA TGGGAATTTT TCTCAGCTAA ACTCTAAATG GTGATTARGC
AAAAAAAAAA AAJ^AAAAA.CY GRAGGGGGGC CCGGTACCAA TTCGCCCTAT AATGA



(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1102 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
TTTTTTTTTT ACCATTTAAA ATAAAATGAA AGTGACCTTC TGTTTATAAA AATCTTTGTC
TGCATCTCTG CTTATTTCCT TAGAAGAGAT TCCAAGAAGC GGTGAGTGAT TTCACGGCAG 
CAGAGGGTTG GGACATATTA CGGGCGCGGA TCCCTCTTGG AGTGAGATGA CTCTCCGGAG
AGATTTAGTC GTCACCCTCG CCTGTGAGGC TGCGTCACAC CCCAGGGATG TGTCTATCAA
GATGGAAGAT CTTTTACACG CTCTTGATTT TGTTTG3CTV TTTTTCTATT ACTAGTGAGA 
AKGAAACTTT TTATATGATT ATTATCCATC ATAATCCAAC ACAAATTACT GCTTCATGTT 
CTTTTACTTT CCTGTGAAGG TTTTAGTGCC TTTTAAAAAT TGCTATATAT TAAGCTTGTT 
AATACTTCCA TGCTGTATTT GTGGSCATCA RTTTCCCCGG GMACAGGC*iT GGACATTTTG 
CCTTCACACG CTGGGTGGTT TTTCATTTTC Ai-rXTCTATTT CTCGTTCTTC TAT CGTTTTA 
TGTTCAGACG GGTTTCTCCG TGTAGAAAGC AGTTTATGAA GATTTACTTT CGACAGTCTT 
CTCTCTACTT TCTACAGTGA ATTCTCTGAT GTGTCTGGGA GTTTGGGGGT CTGGGTAAGA 
RTCCTCCTCT CACCCTATTC TCTATTACGA TCCACAGCCT CATGCTTTAT GARATTGGTG 
GCCGGGARCG GGGGAGATTT GCGGATCCCC CAAGCCAGAC TTTATCCCCC TATCCCTGCC
TCTGGATCCC ACGTACAGGC CTGGGAACTC CCTGTGGGTA GGGGCCAATG GTCTCGCACT 
CTCACCTGTA CCCCAGGGCT GGCACAGGAT GGTCAAGGAG AGAGGCTGCC CAAGCGCATC 
CYTCTGGTGT CCCCCTGACA CGCCTCCAAA GTGAGCAGGT AGGTTTCAAC AGCCCCACGT 
TGCAGGTGGG AGATGAAGCT CAGGGTGGAG ACCAGTATCT CACAGTTCTC TTTGCATGGC 
CGGGTACTTG TTAGTCAACT GATCAAGTGA AAATTCTAGC CCCAGAGGCA GGAGAATCCG
GAACAAAATT AAACCAGCCA GG 

(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1533 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

GGCACGAGCC GNCACGGGCA GCGCCCCATA GCGCCAGGGA CCCOCTG3CA GCGGGAGCCG 

CGGGTCGAGG TTATGGATCC AGCGGGCGGC CCCCGGGGCG TGCTCCCGCG GCCCTGCCGG 

TGMCTGGTGC TGCTGAACCC GCGCGGCGGC AAGGGCAACG CCTTGCAGCT CTTCCGGAGT

CACGTGCAGG CCCTTTTGGC TGAGGCTGAA ATCTCCTTCA CGCTGATGCT CACTGAGCGG

CGGAACCACG CGCGGGAF.CT GGTGCGGTCG GAGGAGCTGG GCCGCTGG5A CGCTCTGGTG 

GTCATGTYTG GAGACGGGCT GATGCACGAG GTGGTGAACG GGCTTCATGG AGCGGCCTGA
CTGGGAGACC GCCATCCAGA AGCCCCTGTG TAGCCTCCCA GCAGGCTCTG GCAACGCSCT 

GGCAGCTTCC TTRAACCATT ATGCTGGCTA TP-AGCAGGTC ACCAATGAAG ACCTCCTGAC 

CAACTGCACG CTATTGCTGT GCCGCCGGCT GCTGTCACCC ATGAACCTGC TGTCTCTGCA 



CACGGCTTCG GGGCTGCGCC TCTTCTCTGT GCTCAGCCTG GCCTGGGGCT TCATTGCTGA 
TGTGGACCTA GAGAGTGAGA AGTATCGGCG TCTGGGGGAG ATGCGCTTCA CTCTGGGCAC
CTTCCTGCGT CTGGCAGCCC TGCGCACCTA CCGCGGCCGA CTGGCCTACC TCCCTGTAGG 



AAGAGTGGGT TCCAAGACAC CTGCCTCCCC CGTTGTGGTC CAGCAGGGCC CGGTAGATGC 
ACACCTTGTG CCACTGGAGG AGCCAGTGCC CTCTCACTGG ACAGTGGTGC CCGACGAGGA 
CTTTGTGCTA GTCCTGGCAC XGCTQCACTC GCA.CCTGGGC AGTGAGATGT TTGCTGCACC 
CATGGGCCGC TGTGCAGCTG GCGTCATGCA TCTGTTCTAC GTGCGGGCGG GAGTGTCTCG 
TGCCATGCTG CTGCGCCTCT TCCTGGCCAT GGAGAAGGGC AGGCATATGG AGTATGAATG
CCCCTACTTG GTATATGTGC CCGTGGTCGC CTTCCGCTTG GAGCCCAAGG ATGGGAAAGG 
TGTGTTTGCA GTGGATGGGG AATTGATGGT TAGCGAGGCC GTGGAGGGCC AGGTGCACCC 
AAACTACTTC TGGATGGTCA GCGGTTGCGT GGAGCCCCCG CCCAGCTGGA AGCCCCAGCA 
GATGCCACCG CCAGAAGAGC CCTTATGACC CCTGGGCCGC GCTGTGCCTT AGTGTCTACT 



TGCAGGACCC TTCCTCCTTC CCTAGGGCTG CAGGGCCTGT CCACAGCTCG TGTGGGGGTG
GAGGAGACTC CTCTGGAGAA GGGTGAGAAG GTGGAGGCTA TGCTTTGGGG GGAGAGGCGA 
GAATGAAGTC CTGGGTCAGG AGCCCAGCTG GGTGGGCCCA GCTGCCTATG TAAGGCCTTC
TAGTTTGTTC TGAGACCCCC ACCCCACGAA CCAAATCCAA ATAAAGTGAC ATTCCCAAAA 
AAAAAAAAAA AAAAAAAAAA ANCCCGNGGG GGG 



(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:
ATCCTCTGGA ATCTAGGTGG AAGCCACCAA GCCTTCTTCA CACTTGCGTT CTG-GCATCT 
GCAGACTTAA CCCCATGTGG CAATCACCAA GCGTTATGGC TTGTGTCCTC GAGAACTGTG 
GCCAGAGCTG TACCTGGGCC CCTTTGAGCT GAGGCTGAAG CCAGAGTCTG AAGCTCAGCA 
GGGCAGTARG GCCCTGGGCC TGGCCCCTGA AACCATTCTT TTCTCCTAAG CCTCTGGGCC 
TTTGATGGGA RGGGCTGTCC TCAAGATTTT TGAAATGCCT TTGGAGGGTT TTTGCCTTGT 
CTTGGATATT GGCTTCCTTT TAGTTATGCT CATCTCTCTA GCAAGTGAAT GTTTCACAAC 
CTGCTTGGAT TCTTTCTCTA CCACAGARCG AGGCTGCAAA TTTTACAAAC TTTTACACTC 
TGTTTCCCTT TTAAATATAA ATTTCAATGT TAAGTCACTT CTTTGCTCCC ATATCTGATT
TAGGTTGCTG GAAGTAGCCA AGTCACCTCT TGAATGCTTT GCTGCTTAGA AATTTCCTCT
ACTAGGTAGC CTGGGTCATC ACACTTAAGT TCAAA 

(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 639 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
TCCTTTCATC TTAAGCACCA CCCGACAGGG CAGGTACTAT TACCATCTCC GTTTGAC-iGA 
TMAGGAACCT GGCACAGGAA GCATTTAAGT GGATTCCCCA GGATCGCCCC ACTGTCAGGA 
GCAGANTCAG AATGGGCCTC AGCATCAGGC TCCCAATCCT GGCTTCTAAC TGCTGCGCTC 
TGCCCTTCYC TCWCCCCACC TCCCCACTCC AGTGCCTTTG GTCATGCCAC TGCAGCTTTC
AGGCCAATAC TGGATTAGCC TCTTAGTGTT CTTGTCCCTG CAGCCATTTC CCC.-GGCAGC 
AATTCCATGT GCCCTCACTG ATGTAGGTGG CTCTTGTGTC ATTTGTCACA TCCTATTGAA 
TTGTTTATGC ATCTTGTTCA CACTCACAGC ACCCTCCCTC TCACACGTCC TCCTTATAAA
AATGTCCCTC AGTGTCTGCT ATGAGCCAGG TGCAGACTTA AGTGAOGGG CTGCTACGGG
AAATAAAAAA TTAACAAGGA GCACCTGCCT CTTAATGCAC AGTAACAAAC TATGTTAAGT 540

GTCAGGAAGG AAAGGTTAAG GATGCCAGGA AGGCTTTTAA TAAATAACCT GATTTAGATG 600

GGCAGGTGGT GCTGAFGA.TT AAGAACGTGT TCTTC7CGA g 35 

(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 744 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

GAATTCGGCA CGAGAGTGGC TGGAGTCTGG CTGCAGAGGG AAGACATCAG CAGGGAGGGA 60 

GCCAGGGCCT GTCACATCTT TCCTCTGGCC ATTGTCCTGG TCTTTGTAAG CCCAGAATCT 120

CCCCTTCCCT GAAGGGAGGC CAGCACCCCA GGAGGGCAGC AGGTGTGCTG TGAGGGTTGG 180

AGTAGTGTGA GAGGTCAGGG TACACTAGAA TGGCCATGGA CACCATGTGG GGGTGCTCTG 240

GGCTGGGCCA CAGAACAGTG TCCTTCCTGC TGCTCCTCCC CTGCAGCTTC CCCCGACCTT 300

GTNGTTTATT TGGTTTGATA CCAATCAGCA GACCCTGCAA GGTGGAAGCT CCCAG3CTCT 360 

CAGTCCCACS ACTCTCATGT GCCAGTCACC CNTACTGTAA CTGCCCAATG AGTACTTCTT 420

GCCCACTGCC AAGATAGAGC CAGTTTACCA AGACAGGGGA ATTGCAGTAG AGAAAGAGTT 480 

GAATATACAT AGAGCCAGCT AAATGGGAGA GTGGAGTTTT CTTATTACTT AAATCAGCCT 540 

CCCYTAAAAT TCAGAGGTGA GAATTTTTCA AGGACAGTTT CGTGGSCAGG CCTAGGGAAT 600 

GGATGCTGCT GATTGGCTAG GGATGCAATC ATAGGGGTGT AGAAAAGTWC CTTGTGCACT 660
GAGTCCACTT TTGGTGAGAG CTACCAAGGA GCTGCTGGTC TGCTGGTCCC GGTAGAGCCA 720

TCTGGTGTCA GGAATGCAAA AGTG 744 

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 525 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
AAGCCCATAA GTGCTGGCCT GTTGGGACAA ATGAGAGAAA TCCCATAGGG TGGTGATGAC

AGCC-C-ViTCA GCCATCTTAY TCCTGGGGAA AATGAAACTT GTGCTCCTAT C-AATGCTCA

GTTGTAAAAC TGGAAAAAAA TTTTAGAAGA CVTCTTGTCC AGCATCTGTG TTTATGTCTA 
TAAAATGTAG AAAACTAAAG CACAGAGATG TTAAATGTTT TGTCCAAGGT GCAACAGCTG 
GTTAGGARGC TTGGTCTGGT GACCTTTCTA CTGAACCACA GTC-CCGCTGG GGGAAGTCCT 
CAGCACJ^.GAT GGCTGCTGCT ATAGCTGGGG TATGGGCAGT ATTAGTAGTT AACCAGTCAA 
CCCAAGTTCC CATAGTCTAG GTTCTGCTTC AGCTGGAGGT TAGGGAAAAA CACAAGAAAA
TCCCTTACCA CTCTACCAGT GCTGGGGGAT GTAGTAAGAG ATCGCC 

(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 425 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:
GGCACAGGGC AGGAGAGACT TGGTCCATGG GGAGAAGCCT GCAGTATAGA TGGGACCTCC 
AGGAGCCCAA GTAGCATAGA CCCTGCTGAT CCGGGGCCAT TGAGCCAGAG GATTTGGGCT 
GAATGTCCCC AGAGACAAAA GGGAAAGGTA GATCCTTTCC CTTAAAGATG AAAGCCATCG
CCCGGGCTTG CTTATTGCTC TCTCTCCTGG TCCTTCCACA TGTTGTTTCT GAACATTTG' i' 
TCTGGCATCA CAATCCCCGT CATCCTGTCA TCTGGCCCTT CCCACCTTTC CACCTTATCT 
CTTGCAGTGT CTCCGCGTCG ACCTOGCACC TGGGTGAARG CTTGCTCTTG CTGGTGCCCA 
TAGCCCCCAG TGTATGGTCT TGAHCTCCCC AGCCATATGG ARACCCACCT CAGGAGGGCC 
CCTCGA 

(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 844 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

GOCAC^GCOG CACGAGATAG GAAGCTTGGC AGGGGCAGCT CCCCCAGTGC GCATTGCCCT
GTAACTCGAC CGCCTGGGAG TGGGGAGAGG CTTGGAAATG GAGCAGGGTG GTGGACCTCG 120

TCTTCTCCTG CTGATCCCAG GCCTCCTCCA TAAO-CCTAC CTAGCACGGC CTGGGGACTT 180

CCCAGCG-GAA GGAACAACTG AGAATACTGA GTGCCAGGGT AGCCCTPJGCC CCATTTCACA 240 

CCTGGGCAAA GTGAGGTCAC TGGATTCAAA CACTCAGATT TAAACCTCCT CTGTGTCTGC 300

AGGACCTGTA TATAACTGCC AGCCTCTGCT GCCCCTCTCC AAAAAGTCTC TGCCCTTGTC 360 

TTTGGCACGT GTCTCTGTCC TCCCCATTCT CTGCTCCTCC TTTC7CCAAC TCAGANTCAG 420 

CCTGTTAGTT C-X3CAAATGT TCATCGAGCT CGATAATGTA GCAGGACAGG MCTGTCTAAC 480
AGATTCTGGN CTTGCAAGGG TGAGACAAGT ACTCTGCATG TTTCTGTCAT CTTCACAGAT 540

GGTGTGCTCA ACAACTTTGC AGTGAATTGT AAkTA^ITGA TACTGCATAA AACATTGATG 600

TTCTTTAACG GTAGTGGAGC APOGTGGCAA GTCTTATAAT GATAACTGCT CAAGGATGTG 660 

TCAGTGAAGC ATTTGGGGST GCTAGCTCTG CCTATGGGTG AGGTCAGCTA TCTCACGCCA 720

TCTACTTCCA COTGCCCCCC CATGCCAGGC TCACCCTGAG CTGAGATGCC TGAGCAGGTG 780

GCAGAAAGGA GCCACCTGGT TTATGCTTCG GGACCACAAA CTCCTCTATC CAG^IGACAG 840

TTTT 844 

(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1985 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

AGCCCTGCTG AAGTACAGGT TCTTCTATCA GTTTCTGTTG GGCAATGAAC GAGCAACAGC 60

AAAGGAGATC AGGGATGAAT ATGTGGAGAC GCTGAGCAAG ATTTACCTGT CTTACTACCG 120

CTCTTACCTG GGGCGGCTCA TGAAGGTGCA GTATGAGGAA GTCGCTGAGA AAGATGATCT 180

AATGGGTGTG GAAGATACAG CAAAGAAAGG ATTCTYCTCA AAGCCATCGC TCCGCAGCAG 240

GAACACCATT TTCACCCTAG GAACCCGCGG CTCTGT CATC TCCCCCACTG A^CTTGAGGC 300 

CCCCATCCTG GTGC'CTCACA CAGCGCAGCG C^jGAGCAGA GGTATCCATT TGAGGCCCTC 360 

TTCCGCAGCC AGCACTACG3 CCTCCTAGAC AATTCCTGCC GCGAATACCT TTTCATCTGT 420 

GAATTTTTTG TTGTGTCTGG CCCAGYTCCA CACGACCTGT TCCATGCTGT CATGGGCCGT 480

ACACTCAGCA TGACCCTGAA ACACCTGGAT TCTTATCTAG GTGACTGCTA CGATGCCATT 540 

GCTGTTTTTC TCTGTATCCA CATTGTTCTC CGGTTCCGTA ACATTGCAGC AAAGAGGGAT 600
GTTCCTGCCC TGGACAGGTA CTGGGGSACA G3TC-CTTC-" TTGCTATGGC CACGGTTTGA
ACTGATCCTG GAGATGAATG TTC5GAGCGT CCGAAGCACT GACCCCCAGC GCCTAGGGGG



GTTGGATACT CGGCCCCACT ATATCACACG CCGCTATGCA GAGTTCTCCT CCGCTCTTGT
CAGTA.TCAAC CAGACAATTC CTAATGAACG GACCATGCAA TTGC7GGGAC AGCTGCAGGT 
GGAGGTG3AG AATTTTGTCC TCCGAGTGC-C AGCTGAGTTC TCCTCAAGGA AGGAGCAGCT 
TGTGTTTCTG ATCAACAACT ATGACATGAT GCTGGGTGTG CTGATGGAGC GGGCTGCAGA 
TGACAGCAAA GAGGTTGAGA GCTTCCAGCA GCTGCTCAAT GCTCGGACAC AGGAATTCAT
TGAAGAGTTG CTGTCTCCCC CTTTTGGGGG TTTAGTGGCA TT TG TGA&GG AGGCTGAGGC 
TTTGATTGAG CGTGGACAGG CTGAGCGACT TCGAGGGGAA GAAGCC CGGG TA^CTCAGCT 
GATCCGTGGC TTTGGTAGTT CCTGGAAATC ATCAGTGGAA TCTCTGAGTC AGGATGTAAT 
GCGGAGTTTC ACCAACTTCA GAAATGGCAC CAGTATCATT CAGGGAGCGC TGACCCAGCT
GATCCAGCTC TATCATCGCT TCCAZCGGGT GCTGTCCCAG CCGCAGCTCC GAGCCCTCCC 
TGCCCGGGCT GAGCTCATCA ACATTCACCA CCTTATGGTG GAGCTCAAGA AGCATAAGCC
CAACTTCTGA TGTGCCAGAA ACCGCCCTGA GATCTGCCGG TCATCTCCAT GGACTTCTGC 
ACCCCATTCC ATACCCTTCT TCACCTGGGG TACCCCTTCC AGTTTTCCCC TTGCTTCCCA



rTTGAC ATGGCTTAGC TGCCTTCACT CCCAGCACCT TGCCCAACAG GATA-GCTGG 



ATCCCCTTGG CCTTCTGAAT ATCCCAGTGT CTTCAGGTTT CCCAAGACCA CTTCCCTGTG
GGCTTCCAAA ATGGCCTTTA TCATTTCTCC AGTCTGTCAC CCTCCTTTCC TGCTCCCATA 
CACC'CAAGGC TTGTTTCTTC CCCTGTAAAA ACCACTGCCT CAATCTCTGG TTC, a JCTCAAC 
TAGTCACCAT GTCCTGAGGC ATGAAGCCTC CTCAGCTCTT GGAATTGCTG GCAAGGGGTG 
ACTGCCTCTG AGTCATTGTG TTTTTCAAAG TGATTTCTTT TCTGTAGCTT TTTGACCTAA. 
GATGTCAGCA ATTTGAACAC TAACCTCTCC CCTCCTGGCT CA^SAATTAC TCCGAAGTCA 



GTCTGCAGAA AATAAATATT TAGTATGACA TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
AAAAA 



560 
720 
730 
S40- 
900 
960 
1020 
IQ8Q 
1140 
1200 
1260 
1220 
1330 
1440 
1S0Q 
1560 
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1S30 
1740 
1300 
1360 
1920 
1930 
1335 
(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1415 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
ATATGAAGGG AAAGAATTTG ATTATGTTTT CTCAATTCAT GTCAATGAAG GTGGACCATC
ATAT An ATTG CCATATAATA CCAGTGATGA CCCTTGGTTA ACTGCATAC-. ACTTCTTACA 
GAAGAATGAT TTGAATCCTA TGTTTCTGGA TCAAGTAGCT AAATTTATTA TTGATAACAC 
AAAAGGTCAA ATGTTGGGAC TTGGGAATCC CAGCTTTTCA GATCCATTTA CAGGTGGTGG 
TCGGTATGTT CCGGGCTCTT CGGGATCTTC TAACACACTA CCCACAGCAG ATCCTTTTAC 
AGGTGCTGGT CGTTATGTAC CAGGTTCTGC AAGTATGGGA ACTACCATGG CCGGAGTTGA 
TCCATTTACA GGGAATAGTG CCTACCGATC AGCTGCATCT AAAACPATGA ATATTTATTT 
CCCTAAAAAA GAGGCTGTCA CATTTGACCA AGCAAACCCT ACACAAATAT TAGGTAAACT
GAAGGAACTT AATGGAACTG CACCTGAAGA GAAGAAGTTA ACTGAGGATG ACTTGATACT 
TCTTGAGAAG ATACTGTCTC TAATATGTAA TAGTTCTTCA GAAAAACCCA CAGTCCAGCA 
ACTTCAGATT TTGTGGAAAG CTATTAACTG TCCTGAAGAT ATTGTCTTTC CTGCACTTGA
CATTCTTCGG TTGTCAATTA AACACCCCAG TGTGAATGAG AACTTCTGCA ATGAAAAGGA 
AGGGGCTCAG TTCAGCAGTC ATCTTATCAA TCTTCTGAAC CCTA&AGGAA AGCCAGCAAA
CCAGCTGCTT GCTCTCAGGA CTTTTTGCAA TTGTTTTGTT GGCCAGGCAG GACAAAAACT 
CATGATGTCC GAGAGGGAAT CACTGATGTC CCATGCAATA GAACTGAAAT CAGGGAGCAA 
TAAGAACATT CACATTGCTC TGGCTACATT GGCCCTGAAC TATTCTGTTT GTTTTCATAA 
AGACCATAAC ATTGAAGGGA AAGCCCAATG TTTGTCACTA ATTAGCACAA TCTTGGAAGT 
AGTAC^AGAC CTAGAAGCCA CTTTTAGACT TCTTGTGGCT CTTGGAACAC TTATCAGTGA 
TGATTCAAAT GCTGTACAAT TAGCCAAGTC TTTAGGTGTT GATTCTCAAA TAAAAAA3TA 
TTCCTCAGTA TCAGAACCAG CTAAAGTAAG TGAATGCTGT AGATTTATCC TAAATTTGCT 
GTAGCAGTGG GGAAGAGGGA CCGATATTTT TAATTGATTA GTGTTTTTTT CCTCACATTT
GACATGACTG ATAACAGATA ATTAAAAAAA GAGAATACGG TGGATTAAGT AAAATTTTAC
ATCTTGTAAA GTGGTGGGGA GGGGAAACA3 AAATAAAATT TTTGCACTGC TGAAAAAAAA 
AftAAAAAAAA AAAAGGAAAC TCGAGGGGGG GCCCGG 

(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1935 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

MTCTACCCTA ATCAAGATGG GGACATACTT CGCGACCAGG TTCTTCATGA ACATATCCAG 60
AGATTGTCTA AAGTAGTGAC TGCAAATCAC AGAGCTCTTC AGATACCAGA GGTTTATCTT 120
CGAGAAGCAC CATGGCCATC TQ2ACAATCA GAAATCAGGA CAATAAGTGC TTATAAAACC 180
CCCCGGGACA AAGTGCAGTG CATCCTGAGA ATGTGCTCTA CGATTATGAA CCTCCTGAGC 240
CTGGCCAATG AGGACTCTGT CCCTGGAGCG GATGACTTTG TTCCTGTGTT GGTGTTTGTG 300
TTGATAAAGG CAAATCCACC CTGTTTGCTG TCTACTGTGC AGTATATCAG TAGCTTTTAT 360
GCTAGCTGTC TGTCTGGAGA GGAGTCCTAT TGGTGGATGC AGTTCACAGC AGCAGTAGAA 420
TTCATTAAAA CCATCGATGA CCGAAAGTGA CGAAGACCAA GGCCCACCAA GGCAGCAGAC 480
TGTTAATCAG ACAAACAGAT CTCTGAGAAG GTGCATCAGC TGCTTTGAAG GCTGAAGATT 540
GTTTTGTATG ATACTGCAGA GCATCAGGCA TTTTAAAGCA GATCTTTACT AAACAGGTTA 600
ATGAGCTAAG AAGCAGGTTC TCTCGTCTTT GGGCTCTTTC CTTTGTGAGT TGCATATTGT 660
ATTTTCTTGT CCCCAAGTAG AGACTAGTAG TACAAAAAGG GACCACATTT TTCAAGTATT 720
TCTAAGTATA AAAAACAAAA CAAAAATCTC TTAGGAAATG TCTAGACGTC CATTCTTGGA 780
TTCCCTTTCT TTCCTTTTAT TTTAAAAAAG AACAGTACCC CTCTTTTAAG ATGCTGTCTT 840
ACATTAATGA GCATCTAATG GAAAGAAGGT ATGAGTTGCA GTGAGGATTA GAATAGTGGT 900
GGGTTAGTGG CATTATCTAT AAATACACTC ACCTAAATTG AAAGCTAAGA AGGAAATGTA 960
AATATAATAT ATATTTATAT TTGATGTAAT ATGGAGATCT GCAGATTCTA ATAAACAAGG 1020
ACTATTGCTG ATAGTAGGGT GTGACATAGT GTCTTGTGAA ATGGTTTCCT TGACAAAATT 1080
TAAGCTGAGC TTAAAAGCAA AAAAACAAAA AGTACACAGA AATATTTATT AAAATGTAAT 1140
ACAGTTTATT GAACTTTCTA GGTATGGAGT TTGATGGACA GGGCTGCCTY TAATGAGTGT 1200
GAAGGTCACT AAGTCACTTA GACATCTCAC CGTGGAAGTT TGTGAGCCTG CATTAGGAGA 1260
TAGACTGATT ACCATACATG ACATAAAAAG GAACAGTGGA TAGCTCATAC TTTATGGTGG 1320
TTCTTCTCCT CCGAAATAAT ATACTGCAGA AATCCCAGAC AGAGCTCCTT ACAAACCTTT 1380
AATTGTAATA TATTTTTGAT GATTATTCAC ATTGAATGCA CAGACCAAGA ATTCAGTGAA 1440
TGTCATTTTT TAAAAAACTA ATTTGTATTG TCTGCTCTAG TGATACAAGT TTTAGTAGTG 1500
ATAAACTATT TTAATCAAGC ATACTATTCT TATGGAAAAA AATATCTA1T TTGGCAGGTT 1560
TCTGTGCCTT TATTTCCCTC TTCTGAAAAA AAGTCTGTGT TTTCATAGTT TGGTTTGCAT 1620
TGTATATCAA TAATTAATCA GGAATGGGTT TTGGTGCCTG AAAAATTGGC CATGGAGGCA 1680
CACCAAAGCT TCAAGCACAA GTCTTGTACA TGGGCCATCA CTGTCTGGTT TCACTTCGTG 1740
TGTTTCCTAA ACACATTTAG CTGCTTTTTT AAilAAACTCA GCCCCATACT TGAGTCCCTT 1800
GTTGTTGGGA GCATTTCCAG GCATCTTTTA AGGGAACTGT GACAAACAGC CTCGGGCAGA 1860
TGAACACGGA GGCTCTCTGT TGTCTGTCTC TGAGATCTTT GTGTCTGGGA ATGGCTAAAG 1920
isriTTTG^rrrr ttttt 1935



(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 599 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
GAATTCGGCA CGAGCGTCCA CGCAGCCGCC GGCCGGCCAG C-\CCC.\GGCC CCTGCATGCC 60
AGGTCGTTGG AGGTGGCAGC GAGACATGCA CCCGSCCC3G AAGCTCCTCA GCCTCCTCTT 120
CCTCATCCTG ATGGGCACTG AACTCACTCA AGACTCCGCT GCCCCCGACT CCCTGCTGAG 180
AAGTTCAAAG GGCAGCACGA GGGGGTCTTT GGCTGCTATT GTCATCTGGA GGGGGAAGAG 240
TGAGAGCCCG ATAGCCAAGA CCCGAGGCAT TTTCAGAGGT GGCGGGACCT TAGTCCTACC 300
CCCAACACAC ACCCCTGAGT GGCTCATCCT CCCTTTGGGC ATAACGCTGC CCTTGGGGGC 360
TCCAGAAACA GGCGGTGGGG ATTGTGCCGC TGAGACCTGG AAGGGCAGCC AGCGTGCCGG 420
CCAGCTGTGT GCATTGCTGG CTTAATATGC AGGGCTTGGG GGGCTGTGGC OCATGCCCG 480
GCAGGAGGTG AGTGAGGAGC CCTGTGGCGT GCTGGTGTGG GGATCGTGGG CATTTCAAAC 540
TACCCTGAAC AATGTATCAA TAGAGAAAAA AAAAAAAAAA AAAACTCGA 599



(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 784 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

GAATTCGGCA CAGAAAAAAA AGAGAGACTG GGTCTTACTG TGTTGCCCAG ACTTGTCTTG 60
AACTCCTGCC TCAGCCTCTC AAGTACTTGG GATTATAGGC CAAGAAGCCA CCATGCCTAG 120
TCATTGATCC AGACTAATAC TCTGGGGTCA GCCTCATTTC TTCTCTTTCT 180
CACTTTGCAC ATCCACTTGT CACCAAATCK RGTTCATTCT GCATCCTAAG TAAGTCCTTT 240
GATTCCTCCA GTTGTTCATT AGTAATGTCT CAARTGTAAT TTTTTCTAGT A3TTTTCAGC 300
CTGTCTTTCC KGCCTTCAGT CTTAACTTCT CCAGTACATA KGCCACATTG TTGTCAGCAcC 360
GATCAWATTT TATTTAAAAA TACTTTACAW AKGTTTATKG CCAAATATTA GRAAATACAG 420
ATTCATGGAA AGAAAAATCA CTGTCCCAAG GAGGTCACTG GCATGGTGAG GTTAAGGGGT 480
GA7TTTAATT TTTAAAAATG TATATTTTTT CCTGTGTAGA GTAGTAACAC CCTTGAAAAC 540
ACAWTCCCTT GTAAAGTCTC TAATTCTGTA CTCCGCATCT AGSTGRTCTC TTCTTTCTCA 600
GATATTTTAC AATTTCATTT ATCACCACCT TTCTGTAGCC TTTACCCGTC TCTTCAATAT 660
TWACATATGC AGAAGTTTCT CCTAACAAAC ACCTGCCTGT GCCTCAGTTC TGCTACCACC 720
CTGTTGCTTT CTTTCCCTTC ACAATCAAAT TTAAGAGTGT CAA-AAAAAA AAAAAAAAAC 780
TCGA 784

(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1035 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
AGAGGCCTGG CTGCGTTGCC CTATCTCCGT CTCCGCCACC CACTTAGCGT TTTAGGCATC 60
AATTACCAGC AGTTTCTCCG CCACTATCTG GAAAATTACC CGATTGCTCC CGGCAGAATA 120
CAAGAGCTTG AAGAAOGCCG CAGTTGCGTG GAAGCCTGCA GAGC-A.GGGA AGCAGCGTTT 180
GATGCCGAAT ATCAGCGAAA TCCTCACAGG GTGGACCTCG ATATTTTAAC CTTTACGATA 240
GCTCTGACTG CCTCTGAAGT TATCAACCCT CTGATAGAAG AACTTGGTTG CGATAAGTTT 300
ATCAATAGAG AATAGTTAGG TGGTGACACT ACTTCAAGAG AACCTCTGCA TTCCAGTCAT 360
ACCAATCCTG CAACTTGATT TTCAGAAGTC AAGAGTATAT CGCGATAAGA CAGTGCACAG 420
GTGGAGGGGA AAAAAAGGGG GAGGGGGAAG CTTATCTTGA AAAAGCATCA CAGAAGTAGA 480
AAAAAATGTC GAAAGCATTA TAACTGTAAC GTTCTTTGAG TTTGTGATTG ATCCACATTT 540
TTCCCCCTGC ATTATGGAAA ATGTCTCTCA GCATTGCTTT ATTACAAAGT AAAGGATGGT 600
TTTATAAAAT TGAGACTGAT GAAACATCAA TACTAGAGCC CATGAGGATG AAAGAAATTA 660
TCAAATAGTG CTGAACAGAA TAAGATGTTA ACGCTGAGTT ATTAGGACTG GA-.GGCTATG 720
AAAA3AACTT GAAATTOTCG GAATATGTGC TCTCTTCATG TCATATTCAA TAGAAGTTTC 780
TAGTTTAAGA TTGATTTTGT GTTTTCTTAG GCATTTCAAG TGACAAGCAA AGTAAATGTA 840
TATATTATGT GATAAATCAT GTTTTCAAGA ACGTCAAATT TCTGGACTTT TTTCTTTCAA 900
TTTTTAATTT TTAAAGTTTT TTTGGTATTA AAAAATCYAT TCAOAAGCCA A*AAATWTWT 960
WAAATWTWCM GCGAAAAGCC AAAAAAAAAA AAAAWMAGGG GGCCCCGGGC CCCATCCCCC 1020
CAAGGGGGTC CNGNT 1035

(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

AGGTATTAGG CCCTTTTGTG GGAGCCCCAT GTTTTGTTTT TCTGAGTTGG TGGGGAGGGA 60
SGGAGGGGGA GGGCTGAATT GTTTTGCAGA GGAAGATCGC ATCTGTGCTT TAAATTTCTC 120
ATTACTGGGT TAGAAAACAA AGAGGGAKTG CCCTGCAGAT TTTGTTTTGT GCTTTTAAAT 180
GTTTCTTAAG TTGGAACAGG TTTCCTCGGG CCTGTTTTGA CTGATTGCTG GAGTGCATTT 240
GATAGTTAAA AATTACTAAT TGGTTTTATT TCCCTTCACA CTGTGGGTGG CCACTTCTCC 300
CCCCGTTACT GAAAAATAAC CATTTTAGTG TCAGGCTAGA AATTGAATTG GTGAGTTTTG 360
TGTATCCTTT AAATTAAAAA CCACAAGTGT TTATTGTAGT GGTTAAACTG TAGCATCTCA 420
GCATGTGGGT GGAAGCTGCC TATATTTCTT CCCAGTTTAA CTGGGGAGCA TCTGTGAAAT 480
TAATTTTCCA TCCAGACAGG TGCTGTGAGG AAATGAAGAT AAATGCTCGC TGGAAATTTA 540
CTAACCAGTT TTTATATTGA CCTGCACTQT AAAAAGCAGA TTTAATTATA AACAATATAT 600
TCAAAATGGG CAAATTTTAT TTTCAAATGG AGTGTAGAGC TAGATTAAAA GCAACTCTTT 660
GCCACCTACT CTGCCCTTTT GGCAAAGTTA CCTTGAACAA AGAATCTTAA GGGTTTATTA 720
AGAACTGTTT ATTTTCTTCA TACGGTGTTC TCTGCAGTGC TTTCTAACAG CTTCTGGGTG 780
CAGATTTTCT TCGGCATCCT TTTGGACTCA GCTTATTACA GGTAGGTAGT GCTTAAGAAA 840
AGTCATGGAG GACTAAAGCG TAAGTCCTTT TCACTTTTGC TCCATCTGAA GGTAGGTGAG 900
TTCATCCTCT TCATAGTAAT GCTGTTTTAC CAAGACTTTA TAGCAGATGG AGCGAGAAAG 960
AATTTTCTGC TATTGTGTTC ACTACAACAG GATAGGGACA TCAGACAGCC CCAGAAACCC 1020
CTTCCAGATC TGATATGGGA CTATTAATTT TTATGCTGTT AATTGGTATT CATTCACAAT 1080
GCAGTTGAAG GGGGAAGGCT CCAGTGCATT CTTTGGCTAA GGCCTGAATG CTTGCTCATC 1140
TGTAAGAXCT ATACTCGA3G TTTTGTTTTC CTTTTAAAAT TCTTTAGGGA GAGAGGGATG 1200
GTTTCTGAGG CGTTCTGAAA GTATGATTCA ATGTGCAACA TACAGGTAGG TCTTCAGCAT 1260
AAGCTGAAAT ATATGCATGT AAAAACTTTG ACATCTTTTT TTTTAATTTT CCACTTTCTT 1320
CTTAACTTTA CTTCTCTTTT TGTCCCGCCC CCATCTTACA GAAGTTGAGG CCAAGGGAGA 1380
ATGGTAGGCA CAGAAGAAAC ATGGCAAAGT GCTCTGTGCT TTCAAACCAA AGTGTTCCCC 1440
CCAACCCCAA ATTTGTCTAA GGACTGGCCA GTCTGTTGTG GGCATTGTTT TCTACAACCA 1500
AATTCTGGGT TTTTTTCTTC TTTCTTTAAA CATAGAGGTA CCACCACAAG GGATGCCCTA 1560
CTCTCTCGCA GCTCTTGAAA GCATCTGTTT GAGGGAAAGG TCTCTGGGCA AGCAAGTGGT 1620
TATTTGGATT GCTTGCTTCC CTTTTTCCAC CTGGGACATT G" ^ AATCATAA AATAACAGTA 1680
AATTCCAAAC CTCAAAAACT ATTATGGCCT GAGCACAGCT GAAATCTAGC AGAGTTTAAC 1740
TCTTCTGCCT CCATGTCTGT CACTTATAAT TCAGGTTCTG CTGTTGGCTT CAGAACATGA 1800
GCAGAAGAAT CGTTTTATGC TAGTTATTGC A1TCATGGTT GAAACTCAAC TTAGGGAAAG 1860
GGTTCCAATG TATTAAGCAA TGGGCTGGTT CTCCCCAATC CTCCCTAACA ATTCGTTGTG 1920
TGGACTTCTC ATCTAAAAGG TTAGTGGCTT TTGCTTGGGA TCAGTGCTCT CTATTGATGT 1980
TCTTGCTGGT CTCCAGACAC ATTCCTGTTG CATTAAGACT TGAAAGACTT GTAGATGTGT 2040
GATGTTCAGG CACAGGATGC TGAAAGCTAT GTTACTATTC TTAGTTTGTA AATTGTCCTT 2100
TTGATACCAT CATCTTGTTT TCTTTTTGTA GGTATAAATA AAAACACTGT TGACAATAAA 2160
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AnAAAAAA 2213

(2) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1351 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:
CTTCACAGAC TGACAGPA.TG GTTTTGTTTT O'l'T ' l'l ^ 'i'l T L' G T TT T G TTT T GTTTTTGAGA 60
TGGACTCTAG CTCTGTCAIC CA3GCTGGAG TGCAGTGGTG CGATCTCGGC TCACTGCAAG 120
CTCCGCCTCC CGGGTTCTCA CCATTCTCCT GCCTCAGCCT CCCGAGTAGC TGGGACTACA 180
GGCGCCCACC ACCACGCCCG GCTAATTTTT TGTATTTTTT AGTAGAGACG CGGTTTCACC 240
ATGTTAGCCA GGATCGTCTC GATCTCCTGA CCTCGTGATC CGCCCGCYTC GGCCTCCCAA 300
AGTGCTGGGA TTACAGGCGT GAGCCACCGT GCCTGCCGCA GAATGGTTTT TAAAGCCACA 360
GTTGAGARGC CACCCATTGC CCGGCGCCTG GACAGTGATC ATGTTGTTCA TCTTGTTCAG 420
TCCTTTCTTG TGTGATTGGA ATTATTCATC CCCTTTGAAA GATGAGAAGG TTGAGATGCA 480
AAGAGTCTAC CTTTCCAAGT TCTCACTGGT GGAAAGAF.CT AGAAGCACAG TTCAAAGTTC 540
TGGNTTCTGG ACTCTGCAGT CGnGGTYTCC CTTYTCCCAC TTGCCTACCC TCAATGCCAC 600
ACTGTTTTTG AAGTGGCCCA TAAGTTGAAG GHAAAGTTTA AAjGAGAGTTC AATTTAATCA 660
TCAGHATGCA TTCTTTTTTT TTTCGGAHAC GGAKTTTCAC TCTTGCTGCC CA3GCTGGAG 720
TGCAATGGTG CAATGATCTC GGCTCACTGC AACCTATGCC TCCTGGGTTC AAGNGATTAT 780
CCAGCCTCAG CCTCCOGAGT AGCTGGGATT ATGGGCGCCC ACCACCATGC CCA.GCTA.iTT 840
TTTGTATTTT TTTTTTTAGT AGAGATGGGG TTTCGCCAGG TTGGCCAGGC TGKTCTTGTG 900
AAYTCCTGGC YTCAGGTGAT YTGCCCACYT CATCTTCCAA AAGTGCTGGG ATTACAGGCA 960
TGAGCCACTG CGCCTGGCYT CAGAATGCAT TCTTACACAT CTATCCTAGA CATTTATAAG 1020
CACTCTAATG GATAACAATC CAAGAATAAA TGATTGTAAA AGATGATGCC GAAGAGTTGA 1080
TGTCAATCTT TTTTTCCTAA GAAAAAAAGT CCGCGAGTAT TAAATATTTA GATCAATGTT 1140
TATAAAATGA TTACTTTGTA TATCTCATTA TTCCTATTTT GGAATAAAAA CTGACCTTCT 1200
TTAATCATAT ACTTGTCTTT TGTAA^TAGC AGCTTTTGTG TCATTCTCCC C^CTTTATTA 1260
GTTAATTTAA ATTGGAAAAA ACCCTCAAAC TAATATTCTT GTCTGTTCCA GTCTTATAAA 1320
TAAAACTTAT AATGCATGTA AAAAAAAAAA A 1351



(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2066 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

GGCACGAGGC GGCGGAGGGC CACAATCACA GCTCCGGGCA TTGGGGGr AC CCGAGCCGGC 60
TGCGCCGGGG GAATCCGTGC GGGCGCCTTC CGTCCCGGTC CCATCCTCGC CGCGCTCCAG 120
CACCTCTGAA GTTTTGCAGC GCCCAGAAAG GAGGCGAGGA AGGAGGGACT GTGTGAGAGG 180
ACGGAGCAAA AACCTCACCC TAAAACATTT ATTTCAAGGA GAAAAGAAAA AGGGGGGGCG 240
CAAAAATGGC TGGGGCAATT ATAGAAAACA TGAGCACCAA GAAGCTGTGC ATTGTTGGTG 300
GGATTCTGCT CGTGTTCCAA ATCATCGCCT TTCTGGTGGG AGGCTTGATT GCTCCAGGGC 360
CCACAACGGC AGTGTCCTAC ATGTCGGTGA AATGTGTGGA TGCCCGTAAG AACCATCACA 420
AG-iCA-^AATG GTTCGTGCCT TGGGGACCGA ATGATTGTGA C^JGATCCGA GACATTGAAG 480
AGGGAAITCC A-CGGAA--TT GAAGCG- ATG ACATCGTGTT TTCTGTTCAC ATTCCCCTOC 540
CCCACATGGA GATGAGTCCT TGGTTCCAAT TCATGCTGTT TATCCTGG^G CTGGACATTG 600
CCTTCAAGCT AAACAACCAA ATGAGAGAAA ATGCAGAAGT CTCCATGGAC GTTTCCCTGG 660
CTTACCGTGA TGACGCATTT GCTGAGTGGA CTGAAATGGC CCATGAAAGA GTACCACGGA 720
AACTCAAATG CACCTTCACA TCTCCCAAGA CTCCAGAGCA TGAGGGCCGT TACTATGAAT 780
GTGATGTCCT TCCTTTCATG GAAATTGGGT CTGTGGCCCA TAAGTTTTAC CTTTTAAACA 840
TCCGGCTGCC TGTGAATGAG AAGAAGAAAA TCAATGTGGG AATTGGGGAG ATAAAGGATA 900
TCCGGTTGGT GGGGATCGAC CAAAATGGAG GCTTCACCAA GGTGTGGTTT GCCATGAAGA 960
CCTTCCTTAC GCCCAGCATC TTCATCATTA TGGTGTGGTA TTGGAGGAGG ATCACCATGA 1020
TGTCCCGACC CCCftGTGCTT CTGGAAAAAG TCATCTTTGC CCTTGGGATT TCCATGACCT 1080
TTATCAATAT CCCAGTGGAA TGGTTTTCCA TCGGGTTTGA CTGGACCTGG ATGCTGCTGT 1140
TTGGTGACAT CCGACAGGGC ATCTTCTATG CGATGCTTCT GTCCTTCTGG ATCATCTTCT 1200
GTGGCGAGCA CATGATGGAT CA3CACGAGC GGAACCACAT TGCAGGGTAT TGGAAGCAAG 1260
TCGGACCGAT TGGGGTTGGC TGCTTCTGCC TCTTCATATT TGAOVTGTGT GAGAGAGGGG 1320
TACAACTCAC GAATGGCTTC TACAGTATCT GGACTACAGA CATTGGAACA GAGCTGGCCA 1380
TGGCCTTCAT CATCGTGGCT GGAATCTGCC TCTGCCTCTA CTTCCTGTTT CTATGCTTCA 1440
TGGTATTTCA GGTGTTTCGG A-CATCAGTG GGAAGCA3TC CAGCCTGCGA GCTATGAGCA 1500
AAGTCCGGCG GCTACACTAT GAGGGGCTAA TTTTTAGGTT CAAGTTCCTC ATGCTTATCA 1560
CCTTGGCCTG CGCTGCCATG ACTGTCATCT TCTTCATCGT TAGTCAGGTA ACGGAAGGCC 1620
ATTGGAAATG GGGCGGCGTC ACAGTCCAAG TGAACAGTGC CTTTTTCACA GGCATCTATG 1680
GGATGTGGAA TCTGTATGTC TTTGCTCTGA TGTTCTTGTA TGCACGATGC CATAAAAACT 1740
ATGGAGAAGA CCAGTCCAAT GGAATGCAAC TCCCATGTAA ATCGAGGGAA GATTGTGCTT 1800
TGTTTGTTTC GGAACTTTAT C^AGAATTGT TCAGCGCTTC GAAATATTCC TTCATCAATG 1860
ACAACGCAGC TTCTGGTATT TGAGTCAACA AGGCAACACA TGTTTATCAG CTTTGCATTT 1920
GCAGTTGTCA GAGTCACATT GATTGTACTT GTATACGCAG ACAAATAGAC TCATTTAGCC 1980
TTTATCTCAA AATGTTAAAX ATAAGGAAAA AAGCGTCAAC AATAAATATT CTTGAGTATA 2040
AAAAAAAAAA AAAAAAAAAA AAAAAA 2066

(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1705 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

AATTCGGCAK AGGGCAGCTC TCGGCTGGAA GGAACTGGTC TGCTCACACT TGCTGGCTTG 60
CGCATCAGGA CTGGCTTTAT CTCCTGACTC ACGGTGCAAA GGTGCACTCT GCGAACGTTA 120
AGTCCGTCCG CAGCCCTTGG AATCCTACGG CCCOCACAGC CGGATCCOCT CAGCCTTCCA 180
GGTCCTCAAC TCCCGYGGAC GCTGAACSAT GGCCTCCATG GGGCTACAGG TAATGGGCAT 240
CGCGCTGGCC CTCCTGGGCT GGCTGGCCGT CATGCTGTGC TGCGCGCTGC CCATGTGGCG 300
CGTGACGGCC TTCATCGGCA GCAACATTGT CACCTCGCAG ACCATCTGGG AGGGCCTATG 360
GATGAACTGC GTGGTGCAGA GCACCGGCCA GATGCAGTGC AAGGTGTACG ACTCGCTGCT 420
GGCACTGCCG CAGGACCTGC AGGCGGCCCG CGCCCTCGTC ATCATCAGCA TCATCGTGGC 480
TGCTCTGGGC GTGCTGCTGT CCGTGGTGGG GGGCAAGTGT ACCAACTGCC TGGAGGATGA 540
AAGCGCCAAG GCCAAGACCA TGATCGTGGC GGGCGTGGTG TTCCTGTTGG CCGGCCTTAT 600
GGTGATAGTG CCGGTGTCCT GGACGGCCCA CAACATCATC CAAjGACTTCT ACAATCCGCT 660
GGTGGCCTCC GGGCAGAAGC GGGAGATGGG TGCCTCGCTC TACGTOGGCT GGGCCGCCTC 720
CGGMCTGCTG CTCCTTGGCG GGGGGCTGCT TTGCTGCAAC TGTCCACCCC GCACAGACAA 780
GCCTTACTCC GCCAAGTATT CTGCTGCCCG CTCTGCTGCT GCCAGCAACT ACGTGTAAGG 840
TGCCACGGCT CCACTCTGTT CCTCTCTGCT TTGTTCTTCC CTGGACTGAG CTCAGCGCAG 900
GCTGTGACCC CAGGAGGGCC CTGCCAOGGG CCACTGGCTG CTGGGGACTO GGGACTGGGC 960
AGAGACTGAG CCAGGCAGGA AGGCAGCAGC CTTCAGCCTC TCTGGCCCAC TCGGACAACT 1020
TCCCAAGGCC GCCTCCTGCT AGCAAGAACA GAGTCCACCC TCCTCTGGAT ATTGGGGAGG 1080
GACGGAAGTG ACAGGGTGTG GTGGTGGAGT GGGGAGCTGG CTTCTGCTGG CCAGGATGGC 1140
rtGA CTTTGGGATC TGCCTGCATC GGTGTTGGCC ACTGTCCCCA TTTACATTTT 1200
CCCCACTCTG TCTGCCTGCA TCTCCTCTGT TGCGGGTAGG CCTTGATATC ACCTCTGGGA 1260
CTGTGCCTTG CTCACCGAAA CCCGCGCCCA GGAGTATGGC TGAGGCCTTG CCCACCCACC 1320
TGCCTGGGAA GTGCAGAGTG GATGGACGGG TTTAGAGGGG AGGGGCGAAG GTGCTGTAAA 1380
CAGGTTTGGG CAGTGGTGGG GCAGGGGCCC AGAGAGGCGG CTCAGGTTGC CCAGCTCTGT 1440
GGCCTC3GGA CTCTCTGCCT CACCCGCTTC AGCCC^GGGC CCCTGGAGAC TCATCCCCTC 1500
TGAGTCCTCT GCCCCTTCCA AGGACACTAA TGAGCCTGGG AGGGTGGCAG GGAGGAGGGG 1560
ACAGCTTCAC CCTTGGAAGT CCTGGGGTTT TTCCTCTTCC TTCTTTGTGG TTTCTGTTTT 1620
GTAATTTAAG AAGAGCTATT CATCACTGTA ATTATTATTA TTTTCTACAA TAAATGGGAG 1680
CTGTGCACAG GRAAAAAAAA AAAAG 1705

(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1167 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

TGCAGGAATT CGGGAGAGGT TTTCCGCTAG ACTCTGGGAG TTGGTGAGCA TCATGGCAAC 60
CGTTACAGGG ACAACCAAAG TCCCGGAGAT CCGTGATGTA ACAAGGATTG AGCGAATCGG- 120
TGCCCACTCC CACATCCGGG GACTCGGGCT GGACGATGCC TTGGAGCCTC GGCAGGCTTC 180
GCAAGGCATG GTGGGTCAGC TQGCGGCACG GCGGGCGGCT GGCGTGGTGC TGGAGATGAT 240
CCGGGAAGGG AAGATTGCCG GTCGGGCAGT CCTTATTGCT GGCCAGCCGG GCACGGGGAA 300
GACGGCCATC GCCATGGGCA TG3CGCMGC CCIGGGCCCT GACACGCCAT TCACAGCCAT 360
CGCCGGCAGT GAAATCTTCT CCCTGGAGAT GAGCAAGACC GAGGCGCTGA GGGAGGCCTT 420
COGGCGGTCC ATCGGCGTTC GCATCAAGGA GGAGACGGAG ATCATCGAAG GGGAGGTGGT 480
GGAGATCCAG ATTGATCGAC CAGCAACAGG GACGGGCTCC AAGGTGGGCA AACTGACCCT 540
CAAGACCACA GAGATGGAGA CCATCTACGA CCTGGGCACC AAGATGATTG AKTCCCTGAC 600
CAAGGACAAG GTCCAGGCCG GGGACGTGAT CACCATCGAC AAGGCGACGG GCAAGATCTC 660
CAAGCTGGGC CGCTCCTTCA CACGCGCCCG CGAACTACGA CGCTATGGGC TCCCAGACCA 720
AGTTCGTGCA GTGCCCAGAT GGGGAGCTCC AGAAACGCAA GGAGGTGGTG CACACCGTGT 780
CCCTGCACGA GATCGACGTC ATCAACTCTC GCACCCAGGG CTTOCTGGCG CTCTTCTCAG 840
GTGACACAGG GGAGATCAAG TCAGAAGTCC GTGAGCAGAT CAATGGCAAG GTGGCTGAGT 900
GGCGCGAGGA GGGCAAGGCG GAGATCATCC CTGGAGTGCT GTTCATCGAC GAGGTCCACA 960
TGCTGGACAT CGAGAGCTTC TCCTTCCTCA ACCGGGCCCT GGAGAGTGAC ATGGCGCCTG 1020
TCCAGCAGGT CTATGGGGAT GCCGTGAGGG CTCTGGTAGC TGGTGCCCCG GATTCGCGTG 1080
ATGCCACGGT TQGTGGCCTC GTGCCGAATT CCTGCAGCCC GGGGGATCCA CTAGTTCTAG 1140
AGCGGCCGCC ACCGCGGTGG ANCTCCN 1167



(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1907 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:



GGCACAGGGG AATCATCGTG TGATGTGTGT GCTGCCTTTG TGAGTGTGTG GAGTCCTGCT 60
CAGG7GTTAG GTACAGTGTG TTTGATCGTG GTGGCTTGAG GGGAACCCTT GTTCAGAGCT 120
GTGACTGCGG CTGCACTCAG AGAAGCTGCC CTTGGCTGCT CGTAGCGCCG GGCCTTCTCT 180
CCTCGTCATC ATCCAGAGCA GCCAGTGTCC GGGAGGCAGA AGGTACCGGG GCAGCTACTG 240
GAGGACTGTG CGGGCCTGCC TGGGCTGCCC CCTCCGCCGT GGGGCCCTGT TGCTGCTGTC 300
CATCTATTTC TACTACTCCC TCCCAAATGC GGTCGGCCCG CCCTTCACTT GGATGCTTGC 360
CCTCCTGGGC CTCTCGCAGG CACTGAACAT CCTCCTGGGC CTCAAGGGCC TGGCCCCAGC 420
TGAGATCTCT GCAGTGTGTG AAAAAGGGAA TTTCAACGTG GCCCATGGGC TGGCATGGTC 480
ATATTACATC GGATATCTGC GGCTGATCCT GCCAGAGCTC CAGGCCCGGA TTCGAACTTA 540
CAATCAGCAT TACAACAACC TGCTACGGGG TGCAGTGAGC CAGCGGCTGT ATATTCTCCT 600
CCCATTGGAC TGTGGGGTGC CTGATAACCT GAGTATGGCT GACCCCAACA TTCGCTTCCT 660
GGATAAACTG CCCCAGCAGA CCGGTGACCG TGCTGGCATC AAGGATCGGG TTTACAGCAA 720
CAGCATCTAT GAGCTTCTGG AGAACGGGCA GCGGGCGGGC ACCTGTGTCC TGGAGTACGC 780
CACCCCCTTG CAGACTTTGT TTGCCATGTC ACAATACAGT CAAGCTGGCT TTAGCGGGGA 840
GGATAGGCTT GAGCAGGCCA AACTCTTCTG CCGGACACTT GAGGACATCC TGGCAGATGC 900
CCCTGAGTCT CAGAACAACT GCCGCCTCAT TGCCTACCAG GAACCTGCAG ATGACAGCAG 960
CTTCTCGCTG TCCCAGGAGG TTCTCCGGCA CCTGCGGCAG GAGGAAAAGG AAGAGGTTAC 1020
TGTGGGCAGC TTGAAGACCT CAGCGGTGCC CAGTACCTCC ACGATGTCCC AAGAGCCTGA 1080
GCTCCTCATC AGTGGAATGG AAAAGCCCCT CCCTCTCCGC ACGGATTTCT CTTGAGACGC 1140
AGGGTCACCA GGCCAGAGCC TCCAGTGGTC TCCAAGCCTC TGGACTGGGG GCTCTCTTCA 1200
GTGGCTGAAT GTCCAGCAGA GCTATTTCGT TCCACAGGGG GCCTTGCAGG GAAGGGTCCA 1260
GGACTTGACA TCTTAAGATG CGTCTTGTCC CCTTGGGCCA GTCATTTCCC CTCTCTGAGC 1320
CTCGGTGTCT TCAACCTGTG AAATGGGATC ATAATCACTG CCTTACCTCC CTCACGGTTG 1380
TTGTGAGGAC TGAGTGTGTG GAAGTTTTTC ATAAACTTTG GATGCTAGTG TACTTAGGGG 1440
GTGTGCCAGG TGTCTTTCAT GGGGCCTTCC AGACCCACTC CCCACCCTTC TCGCCTTCCT 1500
TTGCCCGGGG ACGCCGAACT CTCTCAATGG TATCAACAGG CTCGTTCGCC CTCTGGCTCC 1560
TGGTCATGTT CCATTATTGG GGAGCCCCAG CAGAAGAATG GAGAGGAGGA GGAGGCTGAG 1620
TTTGGGGTAT TGAATGCCGC GGCTCCCACC CTGCAGCATC AAGGTTGCTA TGGACTCTCC 1680
TGCCGGGCAA CTCTTGCGTA ATCATGAGTA TCTCTAGGAT TCTGGCACCA CTTCCTTCCC 1740
TGGCCCCTTA AGCCTAGCTG TGTATCGGCA CCCCCACCCC ACTAGAGTAC TCCCTCTCAC 1800
TTGCGGTTTC CTTATACTCC ACCCCTTTCT CAACGGTCCT TTTTTAAAGC ACATCTCAGA 1860
TTAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAGGG CGGCCGC 1907

(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 611 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

ATGAATTAAC GCCAAGCTNT NAATAGGGAC TCACTATGGG GGAAAGOTGG GTAACGCCTG 60
CAGGTACCGT TCCGGAATTC CCGGGTCGAC CCACGCGTCC GATGGGGCTT TAGTAAATCA 120
GGCTTGCAGG CTCAAAGCTG CAATCTGCCC ACTCTCAGGT ACTGAGACTT TGTGGGCCTC 180
AGACACCAGG AAGAAAGTTG GGATACAGTC ATTTGAGTTA AAAAGGGAAT GACCCCTCAG 240
AAACCCGCAT TAGCAGTGTT ACTCTTGGAA GTGCCTTTAC TTTTAACGCT CTCTGTTCTG 300
AAAAAGAGGT GTTTGGTTAC GTGTGAGCCA ACATCACGTT TTGTTAGCTG TGATTTACCT 360
TTGTCCGTTT AAAAGACTTC ACGGAGCCAT TCTGTATACA AGGTGTGCTC TTTCCAATGT 420
AGAAGGGGTT ATGGAAAAGG GTGCGATCCT TTGCTGTAAA CTGGAGAGAC CAGTCCCAAA 480
CAGAGGGGAA TTTTAAGCCC TTCTCATCAC CCAATTGGAT GTTTTTGCTT ATAGCAAATT 540
CCTGCAAAAT AAATAAATAA ATATTTGCAA AACTAAAAAA AAAAAAAAAA AAAAAAAAAA 600
GGGGGGNCCN C 611



(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2632 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

TCCCAGCTCT CAGGACAAGG GCCCTGGGCG ATCTTTTAAA AAAGCCGATT GGGTGTCTTT 60
CTAAAANTAC AACCAGTACT TCATCGTCAA GTTTCTCGGA AGGGAGTCCC CTCCAGATTC 120
TCATGGAGTG ACAAATCTTG ACTCTTGCTC CTGGAATTTT TCAGGCCCAA ACTAGCGTTT 180
CTACAATGAT TTATTTGGCA AATTTGTCTT GATTATGGGT GGCTGATGAG GAACGTGCTT 240
TTGTTAGGAA CCGAAACTGG GCGGCGGTGA GGGCGTGTAC GCAATGAGTC CGCAAGAGGG 300
TGAAATGCTT TCGGTAGGCA CTCCACGGCT GTGAAGATGG CGGCGGCTGC GTGGCTTCAG 360
GTGTTGCCTG TCATTCTTCT GCTTCTGGGA GCTCACCCGT CACCACTGTC GTTTTTCAGT 420
GCGGGACCCG CAACCGTAGC TCCTGCCGAC CGGTCCAAAT GGCAGATTCC GATACCGTCG 480
GGGAAAAATT ATTTTAGTTT TGGAAAGATC CTCTTCAGAA ATACCACTAT CTTCCTGAAG 540
TTTGATGGAG AACCTTGTGA CCTGTCTTTG AATATAACCT GGTATCTGAA AAGCGCTGAT 600
TGTTACAATG AAATCTATAA CTTCAAGGCA GAAGAAGTAG AGTTGTATTT GGAAAAACTT 660
AAGGAAAAAA GAGGCTTGTC TGGGAAATAT CAAACATCAT CAAAATTGTT COjGAACTGC 720
AGTGAACTCT TTAAAACACA GACCJTTTCT GGAGATTTTA TGCATCGACT GCCTCTTTTA 780
GGAGAAAAAC AGGAGGCTAA C<^AGAATGGA ACAAACCTTA CCTTTATTGG AGACAAAACC 840
GCAATGCATG AACCATTGCA AACTTGGCAA GATGCACCAT ACATTTTTAT TGTACATATT 900
GGCATTTCAT CCTCAAACGA ATCATCAAAA GAAAATTCAC TGAGTAATCT TTTTACCATG 960
ACTGTTGAAG TGAAGGGTCC CTATGAATAC CTCACACTTG AAGACTATCC CTTGATGATT 1020
TTTTTCATGG TGATGTGTAT TGTATATGTC CTGTTTGGTG TTCTGTGGCT GGCATGGTCT 1080
GCCTGCTACT GGAGAGATCT CCTGAGAATT CAGTTTTGGA TTGGTGCTGT CATCTTCCTG 1140
GGAATGCTTG AGAAAGCTGT CTTCTATGCG GAATTTCAGA ATATCCGATA CAJ^AGGARAA 1200
TCTGTCCAGG GTGCTTTGAT CCTTGCAGAH CTGCTTTCAG CAGTGAAACG CTCACTGGCT 1260
CGAACCCTGG TCATCATAGT CAGTCTGGGA TATGGCATCG TCAAGCCACG CCTGGAGTCA 1320
CTCTTCATAA GGTTGTAGTA GCAGRAGCCC TCTATCTTTT GTTCTCTGGC ATGGAAGGGG 1380
TCCTCAGAGT TACTGGGGCC CAGACTGATC TTGCTTCCTT GGCCTTTATC CCCTTGGCTT 1440
TCCTAGACAC TGCCTTGTGC TGGTGGATAT TTATTAGCCT GACTCAAACA ATGAAGCTAT 1500
TAAAACTTCG GAGGAACATT GTAAAACTCT CTTTGTATC3 C<LATTTCACC A-.CACGCTTA 1560
TTTTGGCAGT CGCAGCAJTCC ATTGTGTTTA TCATG7GGAC AACCATGAAG TTCAGAATAG 1620
TGACATGTCA GTGGGACTGG CGGGAGCTGT GGGTAGACGA TGCCATCTGG CGCTTGGTGT 1680
TCTCCATGAT CCTCTTTGTC ATCATGGTTC TCTGGCGACC ATCTGCAAAC AACC\GAGGT 1740
TTGCCTTTTC ACGATTGTCT GAGGAAGAGG AGGAGGATGA ACAAAAGGAG CCTATGCTGA 1800
AAGAAAGCTT TGAAGGAATG AAAATGAGAA GTACCAAACA AGAACGCAAT GGAAATAGTA 1860
AAGTTAACAA AGCACAGG^A GATGATTTGA AGTGGGTAGA AGAGAATGTT CCTTCTTCTG 1920
TGACAGATGT AGCACTTGCA GCGCTTCTGG ATTCAGATGA GGAACGAATG ATCACACACT 1980
TTGAAAGGTC GAAAATGGAG TAAGGAATGG GA-.GATTTGC AGTTAAAGAT GGCTACCATC 2040
AGGGAAGAGA TCAGCATCTG TGTCAGTCTT CTGTACGGGT GCATGGGATT AAAGGAACCA 2100
ATGACATCGT GATCTGTTCG TTGATCTTTG GGCATTGGAG TTGGCGAGAG GTGTCAGAAG 2160
AAAGAGAACA TCTTACTGAA AACAAGTTCA TAAGATGAGA AAAATGTACG AGCTTCTTAT 2220
TTACAACACT GCTGCCCCCT TTCCTCCCAG AGTCTGACAT GGATGTTCAT GCAACTTAAG 2280
TGTGTTGTTC GTGAACTTTG TGTAATGTTT CATTTTTTAA ATCTGAGAAA CTAAAAAGTT 2340
TAACGTCTTC TAAAAGArTG TXATCAACAC CATAATATGT AATCTCCAGG AGCAACTGCC 2400
TGTAATTTTT ATTTATTTAG GGAGTTACAT AGGTGATGGG GGAAATTGTT AACTACCTTT 2460
CATTTTCCTG GGAAGTCAAG GTTACATCTT GCAGAGGTTG TTTTGAGAAA AAAGGGCCGT 2520
TCTGAGTTAA GGAGCC\TAG TTCTATCAAT GATGAAAAGA AAAAAAAAAA AACTCGATCG 2580
GGACGAGGGG GGGCCCGGTA CCCAATTCGC CGTATGGGAN TCGAATGAGA CG 2632



(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2249 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:



GAATTCGGCA CGAGCTCACG GTGCTGCGTG ACACAAGGGC AGCGTGCGGG TACGAGCCCA 60
TGGACTTTKT RATGGCCCTC ATCTACGACA TGGTACTGSW TGTGGTCACC CTGGGGCTGG 120
CCCrCTTCAC TCTGTGGGGG AAGTTGAAGA GGTGGAAGCT GAACGGGGGC TTCCTCCTCA 180
TCACAGCCTT CCTCTGTGTG CTCATCTGGG TGGGCTGGAT GACCATGTAC CTCTTCGGCA 240
ATGTCAAGCT GCAGCAGGGG GATGCCTGGA ACGACGCCAC CTTGGCGATC ACGCTGGCGG 300
CCAGCGCTGG GTCTTCGTCA TCTTCCACGC CATCCCTGAG ATCCAGTGCA CCCTTCTGCC 360
AGCCCTGCAG GAGAACACGG CCAACTACTT CGACACGTCG CAGCCCAGGA TGGGGGAGAC 420
GGCCTXCGAG GAGGACGTGC AGCTGCOGCG GGCCTATATG GAGAACAAGG CCTTCTCCAT 480
GGATGAACAC AATGCAGCTC TCCGAACAGC AGGATTTCCG AACGGCAGCT TGGGAAAAAG 540
ACGCAGTGGG AGCTTGGGGA AAAGACCGAG CGCTCCGTTT AGAAGCAACG TGTATCAGCC 600
AAGTGAGATG GCCGTCGTGC TCAACGGTGG GACCATCCCA ACTGCTCCGC CAAGTCACAC 660
AGGAAGAMAC CTTTGGTGAA AGACTTTAAG TTCCAGAGAA TCAGAATTTC TGTTAGGGAT 720
TTGCCTCCCT GGCTGTGTCT TTCTTGAGGG AGAAATCGGT AACAGTTGCG GAACCAGGCC 780
GCCTCACAGC GAGGAAATTT GGAAATCCTA GCCAAGGGGA TTTCGTG7AA ATGTGAACAC 840
TGACGAACTG AAAAGGTAAG ACCGACTGCC CGCCCCICCC CTGCCACACA CAGAGACACG 900
TAATACCAGA CCAACCTCAA TCCCCGCAAA CTAAAGCAAA GCTAATTGGA AATAGTATTA 960
GGGTCACTGG AAAATGTGGC TGGGAAGACT GTTTCATCCT GTGGGGGTAG AACAGAACCA 1020
AATTCACAGC TGGTGGGCCA GAGTGGTGTT GGTTGGAGGT GGGGGGCTCC CAGTCTTATC 1080
ACCTCTCCCC AGGAAGTGGT GGACCGGAGG TAGCCTCTTG GAGATGACCG TTGCGTTGAG 1140
GACAAATGGG GACTTTGGGA CCGGCTTTGC CTGGTGGTTT GGACATTTGA GGGGGGTCAG 1200
GAGAGTTAAG GAGGTTGTGG GTGGGATTCC AAGGTGAGGC CCAAGTGAAT CGTGGGGTGA 1260
GGTTTATAGG GAGTAGAGGT GGAGGGACCG TGGCATGTGC CAAAGAAGAG GCCCTCTGGG 1320
TGATGAAGTG ACGATCAGAT TTGGAAAGTG ATCAACGACT GTTCGTTCTA TGGGGCTCTT 1380
GCTCTAGTGT CTATGGTGAG AACACAGGCC CCGGCCCTTC CCTTGTAGAG CGATAGAAAT 1440
ATTCTGGCTT GGGGCAGCAG TCCCTTCTTC CCTTGATCAT CTCGCCCTGT TCCTACACTT 1500
AGGGGTGTAT CTCCAAATCG TGTCCCAATT TTATTCCCTT ATTCATTTCA AGAGGTCCAA 1560
TGGGGTCTCC AGCTGAAAMS CCCTCCGGGA GGCAGGTTGG AAGGGAGGGA CCACQGCAGG 1620
TTTTCCGCGA TGATGTGACG TAGGAGGGGT TGAGGGGTTC CGACTAGGAT GCAGAGATGA 1680
CGTCTCGCTG GCTCACAAGG AGTGACACCT CGGGTCCTTT CCGTTGCTAT GGTGAAAATT 1740
CCTGGATGGA ATGGATCAGA TGAGGGTTTG TTGTTGCTTT TGGAGGGTGT GGGGGATATT 1800
TTGTTTTGGT TTTTCTGCAG GTTCGATGAA AACAGGGGTT TTCCAAGCCC ATTGTTTCTG 1860
TCATGGTTTG CATCTGTCCT GAGCAAGTGA TTGCTTTGTT ATTTAGCATT TGGAACATCT 1920
CGGCCATTCA AAGCTCCCCAT GTTCTCTGCA CTGTTTGGCC AGCATAACGT CTAGGATCGA 1980
TTCAAAGGAG AGTTTTAACG TGACGGCATG GAATGTATAA ATGAGGGTGG GTCCTTCTGC 2040
AGATACTCTA ATCACTACAT TGCTTTTTCT ATAAAACTAC CGATAAGCCT TTAACCTTTA 2100
AAGAAAAArG AAAAAGGT72 A '-..-..iGGGG GC 2GGGGG AG GACTGACCGC TTCATAAGGC 2160
AGTACGTCGG AGG7GAG7AT GGT7GAAGAA AGCTTTTGAT ATTTCTCAAA AAAAAAAAAA 2220
AAAAAMCCCG GGGGGGGGi:: CGGA1GG GG 2249

(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 213 3 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

GATA.CTATAA GZK1AAGTGAC 

AATTCGGGAG AC 




GANCOGAAAA. TGATGAAAGT 



rrGAAG aggccgaaga aaaggaggaa. 



GCGAGAA.TA.G C7CCGTCCAG GAG2GTAAGG AAGAAATCTC TAAACGTTTT 
CTGA.CCAACT " TGTGTTGArA. TTTGGTGGAA AAAJITTTGAA. AGATCAAGAT 
AGCATGGAAT GCAGGA7GGA CTTAGTGTTC ACCTTGTCAT TAAAAGA.CAA 
AGGATGATTG A2<TTGAGGAA AGAAATACAG CTGGAAGCAA TGTTACTACA 
CTAATAGTAA CTGGA.GA7GT GGT72TGGTA CTAGCAACCC TTTTGGTTTA 
GGGGACTTGG AG3GGTGAGT AGGT7GGGTG TGAATACTAG CAACTTCTCT 
GTGAGATGCA CGGACAAC7T TTGTCTAACG CTGAAATGAT GGTGGAGATC 
CCYTTGTTCA GAGGATGGTG 1 rTCAAATG CT GACCTGATGM AGACAGTTAA 
TCCACAAATG C-GGAGTGGA TAjG-GAGAAA TCCOjGAAAT- TAGTGATATG 
CAGATATAA.T GAGACAAA1G TTC-GAA.CTTG CGCAGGAATG CAGCAATGAT 
ATGAGGAAGG AGGACCGAGG TTTGAGGAAC GTAGAAAGCA TCGGAGGGGG 
TTAAGGCGCA GGTACAGAGA TAGGCAGGAA CCAATGCTGA GTGCTGCACA 
GGTGGTAA.TG GATTTGGT7G CTTGGTGAGC AATACATCCT CTGGTGAAGG 
TGCGGTACAG AAAATAGAGA TCC-iZTAGCC AATCCATGGG CTC CACAGA.C 
XCATCAGCTT CGAGGGGCAC TGCGAGCACT GTGGGTGGGA CTACTGGTAG 
GGCACTTCTG GGGAGAGGAC TAG7G2GGCA AATTTGGTGC CTGGAGTAGG 
TTCAAGACAC CAGGAATGGA GAGG7TGTTG CAAjGAAAGAA CTGAAAACCG 



CGGGTGCAGG 
TGCCTCCGCG 
TTCGCCGTGG 
AAATCACATA. 
ACCTTGAGTG 
AACAGGCGTG 
TGATCAACTC 
GGTGGCCTTG 
GAACTACAGA 
ATCGAAAAWC 
TTATGGCGAA 
TTGAATAATC 
GCAGGAGATG 
ATATAATGGT 
AGAGCAGTTT 
TAGTCAACCT 
TTCGCAGAGT 
TAGTGGCAGT 
AGCTAGTATG 
ACAACTTATG 



60
120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
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CAAAACATGT TGTCTGCCCC CTACATGAGA AGCATGATGC AGTCAGTAAG CCAGAATCCT 1250 

GACCTTGCTG CACAGATGAT GCTGAATAAT CCCCTATTTG CTGG.iAA.TCC TCAGCTTCAA 1320 

GAACAAATGA GACAACAGCT CCCAACTTTC CTCCAACAAA TGCAGAATCC TGATACACTA 1330 

TCAGCAATGT CAAACCCTAG AGCAATGC AG GCCTTGTTAC AGATTCAGCA GGGTTTACAG 1440 

ACATTAGCAA OGGAAGCCCC GGGCCTCATC CCAGGGTTTA CTCCTGGCTT GGGGGCATTA " 1500 

GGAAGCACTG GAGGCTCTTC GGGAACTAAT GGATCTAACG CCACACCTAG TGAAAACACA 1550 
AGTCCCACAG CAGGAACCAC TGAACCTGGA CATCAGCAGT TTATTCAGCA GATGCTGCAG . 1520 

15 GCTCTTGCTG GAGTAAATCC TCAGCTACAG AATCCAGAAG TCA.GATTTCA GCAACAACTG 1530 

GAACAACTCA GTGCAATGGG ATTTTTGAAC CGTGAAGCAA ACTTGCAAGC TCTAATAGCA 1740 

ACAGGAGGTG ATATCAATGC AGCTATTGAA AGGTTACTGG GCTCCCAGCC ATCATAGCAG 1300 

20 ' 

CATTTCTGTA TCTXGAAAAA ATGTAATTTA TTTTTGATAA CGGCTCTTAA ACTTTAAAAT '1360 

ACCTGCTTTA TTTCATTTTG ACTCTTGGAA TTCTGTGCTG TTATAAACAA ACCCAATATG 1920 

25 ATGCATTTTA AGGTGGAGTA CAGTAAGATG TGTGGGTTTT TCTGTATTTT TCTTTTCTGG 1930 

AACAGTGGGA ATTAAGGCTA CTGCATGCAT ■CACTTCTGCA TTTATTGTAA TTTTTTAAAA . 2040 

ACATCACCTT TTATAGTTGG GTGACCAGAT TTTGTCCTGC ATCTGTCCAG TTTATTTGCT 2100 

30 

TTTTAAACAT - TAGCCT ATGG TAGTAATTTA TGTAGAATAA AAGCATTAAA. AAGAAGCAAA ' 2150 

AAAAAAAAAA AAAAATTCCT GCGCCCGCGA ATTCTTCT 2193 

(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1043 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

CTGAAGTGTA TGTGGTGAGG AAGAAGAGGC TCCTACTGTA GACAGCCTTG TTCTACAGAT 50 

50 CCTCCCAGAA ATCTCTGGGC C^GGTGGAAC CCAGGGTCAG AGAGGGATGG GAGAGAGGTT 120 

TAATTTTCCA TGATAAATAA AAATCTATAA AA.TAATAAAC AAGAGAAAAG AGATTCGAAA 130 

CAGCCAGGTT GGAGCAGTGA GTGAGTAAGG AAACCTGGCT GCCCTCTCCA GATTCCCCAG 240 

55 

GCTCTCAGAG AAGATCAGCA GAAAGTCTGC AA.GACCCTAA GAACCATCAG CCCTCAGCTG 300 

CACCTCCTCC CCTCCAAGGA TGACAAAGGC GCTACTCATC TATTTGGTCA GCAGCTTTCT 360 

60 TGCCCTAAAT CAGGCCAGCC TCATCAGTCG CTGTGACTTG GCCCAGGTGC TC<L-GCTGGA 420 
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RGACTTGGAT GG G TTTGAGG GTTACTCCCT GAGTGACTGG CTGTGCCTGG CTTTTGTGGA 43 Q 

AACK^ACTTC AACATATCAA AGATWAATGA AAATGCAGAT GGAAGCTTTG ACTATGGSCT 540' 

5 

CTTCCAGATC AACAGCCACT ACTGGTGCAA CPA.TTATAAG AGTTACTCGG AAAACCTTTG 600 

CCACGTAGAC TGTCAAGATC TGCTGAATCC CAACCTTCTT GCAGGCATGC ACTGCGCAAA 550 • 

1-0 AAGGATTGTG TCCGGAGCAC GGGGGATGAA CAACTGGGTT AGAATGGAAG KTTGCACTGT 720 

TCAGGCGGGG CACTCTTCTA CTGGCTGACA GGATGCGGGG TGAGATKAAA CARGGTGCGG 730 

GTGGACCGTG GARTCATTCC AAGACTCCTG TCCTCACTCA RGGATTCTTC ATTTCTTCTT 340 

15 

CCTACTGGCT CCACTTCATG TTATTTTCTT CCCTTCCCAT TTACAACTAA AACTGAC GAG 900 

- AGCCCCAGGA ATAAATGGTT TTCTTGGCTT CCTCCTTACT CCCATCTGGA CGCAGTCGCG 960 

20 TGGTTCCTGT CTGTTATTTG -TAAACTGAGG ACCACAATAA AGAAATCTTT ATATTTATCG 1020 

AAAAAAAAAA AAAAAAAACT CGA' 1043 



(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 703 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

GAATTCGGCA CGAGTGCGCG GGCACCACGG CGGTTTTTCG ACGCTGGC-GG TGG ACGCAGG 60 

CAGCATGGAC CACGGTTGCT GGGCGGATGG GGAGCGTCTA TGGTCAGTTG CCTTAGAAGT 120 

GGTGAGATGG GAAGCTGCAG TTGGAAGACC CTGGAGGATG CCTGACAAGG GGATGTCTGA 180

CACATGATTG GAGCTCTTTT TGAAATGTTT CTTGCCCTTC CTGGAGCAGA GGAGCCATTA 240

TTTATGCAGG TACATCGAAG TCTTTTGACC TC-CATACAGT GATTATGCTT GTCATCGCTG 300

GTGGTATCCT GGCGGCCTTG CTCGTGCTGA TAGTTGTCGT GCTCTGTCTT TACTTCAAAA 360

TACACAACGC GCTAAAAGCT GCAAAGGAAC CTGAAGCTGT GGCTGTAAAA AATCACAACC 420

CAGACAAGGT GTGGTGGGCC AAGAACAGCC AC~GGCAAAAC CATTGCCACG GAGTCTTGTC 480

CTGCCCTGCA GTGCTGTGAA GGATATAGAA TGTGTGCCAG TTTTGATTCC CTGCCACCTT 540

GCTGTTGCGA GATAAATGAG GGCCTCTGAG TTAGGAAAGG TGGGCAOAA AATCTTCATG 600

AGCAATACTT CTTAGTAGAT TGTTTTGTTA TTCAAATCAA GTTCTAGTGT TTTTATGTGA 660 

GATTATATAA. TTT -jCAGTGT TGTTTTATAT ACTTITGAAT AAA 703 

(2) INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3684 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

GGCAGAGGGG GCATGAGCAG GAGGAGGATT ACCGCTACGA GGTGCTCACG GCCGAGCAGA 50 

TTCTACAACA CATGGTGGNA ATGTATCCGG GAGGTCAACG AGGTCATCCA GAATCCAGCA 120 

ACTATCACAA GAATACTCGT TAGCCACTTC AATTGGGATA AAGAGAAGGT AATGGAAAGG 130 

TACTTTGATG GAAACCTGGA GAAGGTCTTT _ GCTGAGTGTC ATGTAATTAA TCCAAGTAAA 240 

AAGTCTCGAA CACGGCAGAT GAATACAAGG TCATCAGCAC AGGATATGCC TTGTCAGATC 300 

TGCTACTTGA ACTACCCTAA, CTCGTATTTC ACTGGGCTTG AATGTGGACA TAAGTTTTGT 360 

ATGCAGTGCT GGAGTGAATA TTTAACTACC t AAAAT AATGG AAGAAGGCAT GGGTCAGACT 420 

ATTTCGTGTC CTGCTCATGG TTGTGATATC TTAGTGGATG ACAACACAGT TATGGGCCTG 480 

ATCAGAGATT CAAAAGTTAA ATTAAAGTAT CAGCATTTAA TAACAAATAG CTTTGTAGAG 540 

TGCAATCGAC TGTTAAAGTG GTGTCCTGCC CCAGATTGCC ACCATGTTGT TAAAGTCCAA 60 Q 

TATCCTGATG CTAAACCTGT TCGCTGCAAA TGTGGGCGCC AATTTTGCTT TAACTGTGGA 660 

GAAAATTGGC ATGATCCTGT TAAATGTAAG TGGTTAAAGA AATGGATTAA AAAGTGTGAT 720 

GATGACAGTG AAACCTCCAA. TTGGATTGCA GCCAACACAA AGGAATGTCC CAAATGCCAT 730 

GTCACAATTG AGAAGGATGG TGGTTGTAAT CACATGGTCT GTCGTAACCA GAATTGTAAA 940 

GCAGAGTTTT GGTGGGTGTG TCTTGGCCCA TGGGAACGAC ATGGATCTGC CTGGTACAAC 900 

•TGTAACCGGT ATAATGAGGA TGATGGAAAG GGAGCAAGAG ATGCACAGGA GCGATCTAGG 960 

GCAGCCCTGC AGAGGT AC CT GTTCTACTGT AATCGCTATA TGAACCACAT GCAGAGCCTG 1020 

CGCTTTGAGG ACAAACTATA TGCTCAGGTG AAACAGAAAA TGGAGGAGAT GCAGCAGCAC 1080 

AACATGTGCT GGATTGAGGT GCAGTTCCTG AA.GAAGGCAG TTGATGTCCT CTGCCAGTGT 1140- 

CGTGCCACAC TCATGTACAC TTATGTCTTC . GGTTTCTACC TCAAAAAGAA TAACCAGTCC 1200 

ATTATCTTTG AGAATAACCA AGCAGATCTA GAGAATGCCA CAGAGGTGCT GTCGGGCTAC 1260 

CTTGAACGAG ATATTTCC OA AGATTCTCTG C^GGATATAA AGGAGAAAGT ACAAGACAAG 1320 

TACAGATACT GTGAGAGTCG ACGAAGGGTT TTGTTACAGC ATGTGGATGA AGGCTATGAA 1380 

AAAGATCTGT GGGAGTACAT TGAGGACTGA GAATGGGC C T GCATAAAA.TG AACTCTGAAA. 1440

ACTTTACCAT CTAGAGTCCT CATGCAATTA AAACAAAACA AACACAAACA AGGAGGC-.CT 1500 

AAGCCTATTC TGACACCACT GCTCTGTAGT ACCAGAATTG TTTTGTTAAT GGAAAGTTTA 1560 

, AGTAAATTAT ATTGTAATAA AAAGGTAGA.T AAACCATTGT ACAACAGTAT TCTAGGCCGC 1€20 

CAA.CAAAA.GT GTGACAGACA CACTAAAAGC CCTCCAAjCTT TAACTTGTAA CGTAGCTTCA 1530. 

TTCTCAAAGC TGAGTCCTTT TTTTTCTTTT TCCTTTTCCT GAGTGTAGTA CAGTTAAAAT 1740 

TTCAAACAGC TCCTTGACAC TGCCTTTCAT GTTCAAACCA GCCATTTTGT TGTACTTTGG 1300 

TAAAGGACCT CTTCCCCTTC CTCCGCTACA CATACAGATA CACCCACACA CAGACTGACT 13 SO 

CTCrtTCTCT CATACCCCAA GGTCATGAGT GAATGATGCT TAGTTCCTTG TA AAGAAAAT 1920 

CTTGGGATGG .GGAAAGGGGT AGGCAGCAAG AGGATTCAAC AAACGAAAAA CATAAAAACT 1980 

TTGTATATGA CTTTTAAAAC AAGAGGACAA CACAGTATTT TTCAAAATTG TATATAGCGC 2040 

ATATGCATGG ACAAAGCAAG CGTGGCACGT GTTTGCATAA TGTTTAATTA CAAAAAAATA. 2100 

TTTA.TTCTTT AAAAATCTTG AAGATTATGT CTATTTGCTG TGCATTTTCT TTCAGTTTGC 2150 

TTATCTTTCC CGGGTTGGGG TTGGGATAAA. ^GGTGTGTCGG TTTAGCACCT CTGGAAGACC 2220 

TATCTAGAGC TCTTTCACTT TCCTGAGGTT ATTTTGCCCY TTCTGGTGTT GGTATGTCTG 2230 

TTGCCGGGCA TGGGCTMCAY GGCTTGAATT CCTGCTCTTG ATCAGGGACA AGGGAGGTCA 2340 

AGCTCTGACT AATGCCATGA CCTGATTAAG GGGTA.CAGCA GGGAGTTTTG TTGCTACAGC 2400 

TCATGAATTA ACCTGTCCCA ACCTAATCCC CCTCCATGGC ATGATGCCTC TACCCAAGCC 2460 

TTTGTGTGCC CATGTTATGC ACACAGGTGT AGGCATTCTT AAGTGCCCTG TCGCATCCAG 2520 

TGGAAGCATT TTAAAATTTC TTTTACTTTT TGGTTTTCCC TTAATTGCTG CTTTTCAGAT 2530 

TTTAGTTATG GCTCGTCTGC TCACCCCTTC TCTAGATTAG GGTGTCAAAG AGAATGTTTT 2S40 

GCTTTAAATA TAAATAGGGA _ TTGATTTAGT GTCAGATTGT GAATTTAAAA TGGTGGATAC 27Q0 

CGAAATTGCT TGTGTGTGTT GCTGTGGGTT TGGTTTGAAG GCAAACACCC CTAGAACATG 2760 

ATATTCCCAT CTAGTGCATT TAAATAGAAA TCACTGAGTT TGCTGCTTTT TTATTGTCAG 2320 
. CAGAT AGG AG AATTAATAAT GCATTTTAGC TGTGATGTCC ATTTTTATGA AATTCCTACT .. 2330 

AAGAGCTATG TTAAAAGTAA AGGATGGTGG TGGTTGTATT AACTATATAC CTGTTTAGGC 2940 

CATTCTGGCT GTGGTATTTT TCAATAGGTC AGCATCTGTA AATCTGTCAG TTT7ATACAG 3000 

GAGTGCAGAG TGAACTAGGG AACTAGATTA AGAGGTGTAA ATATGAAATA CCAGTTGAGG 3060 

CTGAGGACCT CTTCGTCTTC CTTTAAATGT CTTTTGCCTA GGGAGTGTTT ACCATTTGTG 3120 

AGGCAGCTTT GTCTGGTCTT ACACTGTACA TCCTATTACT CCATTGGGAA GTAGGTTCAC 2180 

TTTCCTCTGG CCTTTTGCCT AAGTTAGGCT TTGGTGAATC AACCCTACTT TTCCTTTTAG 3240 



WO 98/54963 




AAAAGGTXST TACAGGAGAT TTACTG5CAA CTGTTCTTTT CCOVTCAAAA ATCAGTGAAT 
GTTTGCTGAG TATAAATGCT GCTTCCTTAA ACCACTTGTC GCTTTAGGAT OACTTTACC 
TGTACCTTTT CTCCTTTCCT CCCTTGCCA.C CTCAGGTGCA AATCTGAACT CAGTGTCTGC 
TTCTTCCATT TTCTCGTCTC TCTCCCCTCT TCCCCCATTA TCCATATGAC ATTATTTTAC 
TTCAAATGAC AGCATCAATC TTAAAAAGAT ATACATTAAA ACTAAGGAGT TTTTTTAAAG 
AAAGCCTGAA T AAGTTC CTT TCCCTGGTAA CTTTGAAAAG CAGTCAGAGT TGCTATATAG 
ATATATGTGG CTCCTTTAAA ATG C TTTGTG TATGTGTGGT GTTTAAAAAA .^A^AM 
TTCGGGGGGG GGCCCGGTNC CCAT " 

(2) INFORMATION FOR SEQ ID NO: 116:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1965 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:
AAGAAAGGGT ATTAAAATTC TAGATCACAT ATGGACCCGG GAAGGTTTTT MACCCTCTGT 
TAGTGACATC GAGTCTCCCA CTAGACAAAA TAGGTGGAAA AATCTCTCGA GGGCTCACAT 
TGTTTTGTCA TCTTCAGGAA AAACACCACC AGGCCATACC ACAGCCTGCC CAGTGAGGCG 
GTCTTTGCCA ACAGCACCGG GATGCTGGTG GTGGCCTTTG GGCTGCTGGT CCTCTACATC 
CTTCTGGCTT CATCTTGGAA GCGCCCAGAG CCGGGGATCC TGACCGACAG ACAGCCCCTG 
CTGCATGATG GGGAGTGAAG CAGC^GGAAG GGGCTCCCAA GAGCTCCTGG TGGTGCAGCC 
TGTGCTCCCC TCAGAAGCTC TGCTCTTCCC AGGGCTCCCG GCTGGTTTCA GCAGGCGACT 
TTCTTCCAAT GCTGGGCCCA GACTTCTTGC CTGGGTGCTG GCCTGCCCTC TCCGGNCCGC 
TTGCTGCCTG TCTGCTTTCC TTGGTGGYTT TGCTGGGTGC TGGGCCTGCC CTCTCCGGCC 
GCTTGCTGCC TGTCTGCTTT CCTTGGTGGC TTTGCTGGGT GC^TGGGCC'TG CCTTCTCTGG 
CTGCTTGCTG CCTCTCTGCT TTCCTTGGTG GCTTTGGCTT CTGCACTCCT TGGCGTCASC 
TCTCAGGTCC TCCATTCACA CGAGGTCCTC CTCGCTCTGG CCGCTCTTGC TGCTCCTGTC 
TGAAGAWATC AGACTGATTT CCTCTTAAGA CTCCTAGGGA TGTGGTGAAG AGCTGGGACT 
CAAGTGCAGT CCACGGTGTG AAACATGAGG GARGTGAGGT GTCCGTCCAC TTCCCCCATA 
AAGGTGTGCA TTTCAGTTAG GCTCCCCCGC CACAGAGCAG GCTTCATCTG CTCTGCCATC 
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CAGCCCCATC TGGATGTGAG GTGGGGTGGA GACATCATGG GGTGATTGCA. GAAAGGGGGA 960 

GTGGCGGCCC ACGCAGCTTC TGCTGAGGAG CTGACCGCTC TGAGCTGTTC TGTTTCGTAT 1020 

5 TGCTGCTCTG TGTCTGCATG TATTGTGACC GTGCGGCTCC ACCTCTTCCA GCTGCTGCTA 1080 

CAGCTGAGGC CTGGATCCCG GCCTTTCCCT GTGACTTACG TGTCTGTCAC CGGCANGCAG 1140 
CCCTACAAAT CCTGGTGACC TGCTCTCCCA AGAACAGAGC CTGTCOCCAG ATGTCGCAGT ' T?G0 

AGCGATGAGT AACAGAGGTG GCTGTGGACT TCCTCTACTT CTCCTTGCTG GATCAGGGCC 1260 

TTCCTGCCTC C CGCTGGGCA GGTCTGGCCT TGCTCTCTTG GCAGGGCGCC AGCCCCTGTG 1320 

15 ACGACTCTGG AGCTCAGCAT GCAGCTGATG CCAAAGTTGT GGTGTCCA<£t GTGCAGCAGC 1330 

.CCTGGGAGCC ACTGCCACCT TCAGAGGGGT TCCTTGCTGA GACCCACATT C-CTTCACCTG 1440 

GCCCCACCAT GGCTGCTTGC CTGGCCCAAG CTAGCGTTCT GTGCCATGCT AGAGCTTGAG 1500 

20 

CTGTTGCTCT TCTTCAGGGG AGGAAATAGG GTGGAGAGCG GG AAGGGTCT TGCTCCTAAG 15 60 

TGTTGCTGCT GTGGCTTTTT TGCCTTCTCC AAAGACGCAC TGCCAGGTCC CAAGC TTCAG 1520 

25 ACTGCTGTGG TTAGTAAGGA AGTGAGAAGG CTGGGGTTTG GAGCCCACCT ACTCTCTGGC 1530 

■ AGCATCAGCA TCCTACTCCT GGCAAGATCA GGCCAACGTC CACCCZ^CCC TCACATTGGC 1740 ' 

AGATGTTGGC AGAAGGGGTA ATATTGACCG TCTTGACTGG CTGGAGCCTT CAAAGCGACT 13 QQ 

30 

GGGATGTCCT CCAGGCACCT GGGTCCCATG ACCAGCTCCC CGTCTCCATA GGGGTAGGGA 1350 

TTTCACTGGT TTATGAAGGT CGAGTTTCAT TAAATATGTT AAGAATCAAA GCTGTCTTTG 1920 

35 TTCAGGCTGC TATAACAAAA. ATATAATAGC CTGGGTGGCT TAAAG 1965 



(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 503 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

50 • AGTGATCCCC TTGCCTCGGC CTCCCAAAAT GCTGGAATTG TAAGCGTGGG CCTCTGCACC 60 

CGGCCTGGTC CGCAATTTAA AAACGCACAG CCACC1ATTCC CTYTCCAGAA AGCACCCAGA 120 

TGCCTTTGGG AGAAOCAGCC TCCTCCATGG AGGAAAGCTT GGGATCTGCC TTCCCACCTG 130 
55 - 

GGGAGGAGAG GGATCTGTGG AAAATCCTTC TGACGGACTT CCCCTCAGTG CCTGATCCAT 240 
ACTCAATAGT AGAAAAAGTA AGAAATATAC AAAGATAGCA. GATACACGGA GACAGTTCCC 300 
60 CAAATAGCTG AGCGAWTAGC GCAGAAGCAA TATTGAAGAC CTAA.TAGCTG AGACATTTCC 360 



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AGAACTGATA AACTGCATCC AGCGACAGA? CAAGCAGCCC AGAAAATTCC AGGCAGCA.TC 420 

AACAAATAAA 7AGCCCCACA TGCACCCGTG AAAATGCAGA AGACCAAACA AAAAAGTCCG 430 

5 

GTCAACAGCC AGAGTTAAAG AGG 503 



(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1133 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

GGCACAGCTT GGAATGAACC CCTGTGGATA. AGGGGGACTA TTAGATAGAA TAAACATCAA 60 

TAAA.TGCTTG ATGAATAAAC GCTAATCCTA CCTTCCCAGC CTGACACCTC CCAGTGGACA 120 

25 CCACACTTCA CTTGAAGCCT TAGAAACCTT TCCCACCCAT GCTTCC=C-CC CTGGCTTCAT 130 

GTTGCCATTT CTCACCCCCA GAACAGGCCG CCCGCCTGAA GAAACTA.CAA GAGCAAGAGA. 240 

AACAACAGAA AGTGGAGTTT CGTAAAAGGA TGGAGAAGGA GGTGTCAGAT TTCATTCAAG 300 

30 

ACAGTGGGGA GATGAAGAAA AAGTTTCAGC CAATGAACAA GATCGAGAGG AGCATACTA.C 350 

•ATGATGTGGT GGAAGTGGCT GGCCTGACAT CCTTCTCCTT TGGGGAAGA.T GATGACTGTC 420 

35 GCTATGTCAT GATCTTCAAA AAGGAGTTTG CACCCTCAGA TGAAGA.GCTA GA.CTCTTACC 430 

GTCGTGGAGA GGAATGGGAC CCCCAGAAGG CTGAGGAGAA GCGGAACNTG AAGGAGCTGG 54Q 

CCCAGA.GGCA AiMGAGGAGGA GGCAGCCCAG • CAGGGGGCTG TGGTGGTGAG CCCTGCCAGC 600 

40 - 

GACTACAA.GG ACAAGTACAG CCACCTCATC GGCAAGGGAG CAGCCAAAGA. CGCAGCCCAC 650 

ATGCTACAGG CCAATAAGAC CTACGGCTGT KTGCCCGTGG CCAATAAGAG _ GGACACACGC 720 

45 TCCATTGAAG AGGCTATGAA TGAGATCAGA. GCCAAGAACC GTCTGCGGCA GAGTGGGGAA 730 

GAGTTGCCGC CAACCTGCTA GGCGCCCCGC CCAGCTCCCT TTGACCCCTG GGC<L*GQGC*. 340 

' GGGGGCAGGG AGAjGACAAGG CTGCTGCTAT TAGAGCCCAT C-GTGGAGC2C CACCTCTGAA 900 

50 

CCACCTCCTA CCAGCTGTCC CTCAGGCTGG CGGAA-AACAG GTGTTTGATT TGTCACCGTT 960 

GGAGCTTGGA TAXGTGCGTG GCATGTGTGT GTGTGTGTGA GA.GTGTGAAT GCACAGGTGG 10 20 

55 GTA.TTTAATC TGTATTATTC CCCGTTCTTG GAATTTTCTT CCCATGGCGC TGGGGTACTT 1080 

TACATTCAA.T AAATACTGTT TAACCCAAAA AAAAAAAAAA AAAAGAA-AGA. AGN 113 3 

(2) INFORMATION FOR SEQ ID NO: 119:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1101 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

GGGOiOJGCT GAAGCTGCAG ACCTCCCCAG GGGATGGCTC CTCTCCCCCA GGAGCCCCGA 50 

GGCAGGGGAG GCAGAAAOCC TGGGCTCTGG GGGGTGGCCT GCGGACAGCT GTGCTGTGGG ' 2 0 

15 

CCGGGGGCTG GGCCTGTCCC ACAGGGNCGT GGAGCTCGTG GTTCTGAGCA GCCAGCTGGG 130 

TGGTGTCTGG GGATAGCTGG GAGGCACAGC GGCTGCCATG TGGGACTGGG ACTGGAGTGC 240 

20 ^ TCCCTGGTCT TGGCCTCTGT GGCTCAGCCT TGCTCTGGTC TGCCTGAGTG CAGGGGCCAA 300 

GGGGCACAGG GCCAGTGAGG CCGGCCACGC TCGGGCCCTC ACCTGTGAGA TGGGGTCGGA 3 60 
ATTTKACACA GCCTANGGCT TGGTTCTTGG TKGTNGAHCG TGGACTYCTK AGAACGGGAG 

25 

TGCTGGTCCT . GAAAGGCGTG GTTGGAGACC AGCTGCTTTT CTCGCTGTTT TTCTCTTAGG 430 

AGATTAAACA AAAACAGAAA GCACAAGACG AACTCAGTAG CAGACCCCAG AGTCTCCCCT 540 

30 TGCCAGACGT GGTTCCAGAC GGGGAGACGC ACCTCGTCCA GAACGGGATT CAGCTGCTCA 500 

ACGGGCATGC GCCGC-GGGCC GTCCCAAACC TCGCAGGGCT CC±GC*.GGCC AACCGGCACC 650 

ACGGACTCCT GGGTGGCGCC CTGGCGAACT TGTTTGTGAT AGTTGGGTTT GCAGCCTTTG 720 

35 

CTTACACGGT CAAGTACGTG CTGAGGAGCA TCGCGCAGGA GTGAGGCCCA GGCGCCGAGA 730 

CCCAAGGCGC CACTGAGGGC ACCGCGCACC AGAGCGTGAC CTCGGCAGGC TGGACA.CA.CT " 340 

40 GCCCAGCACA GGCAGACCCA CCAGGCTCCT AGGTTTAGCT TTTAAAAACC TGAAAGGGGA 900 

AGCAAAAACC AAAATGTGTG ACTC-GGCTTT GGAGGAGACT GGAGCCTCAG CCCTGTCCTG 960 

GCCXCGGGCC GCrrOGGQCTG GTGTGGGTGG GCCTTGTGTG CTGGATTTGT AGCTTATCTT 102Q- 

45 ' * 

CCGTGTTGTC TTTGGACCTG TTTTAGTAAA CCCGTTTTTC ATTTTAAAAA AAAAAAAAAA 1030 
AAACTTTGGG GGGGGGCCCC N 1101 

(2) INFORMATION FOR SEQ ID NO: 120:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 282 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

AGCTTCTCTG TCCAGTCTTG AACTCTGGGS TCTCTTGGAA CTTTCGTCAC CCCTCTCAGC 60

CIGAATATTC GTTGCATGGA TTGCACTCAA CCAGACTTTG GATCTGTGCC TACTTAATCA 120

ACCTTATCTT TGCAATATGT TCGGGCCCAC CTTOCACTCC TTGGTTCTTG TTCCTCCTTG 180

GCCTAACTTG TCCCTTCTCC ACTTCAO.TC CCCGCTGGGA CAGCATTCCT CCTTCCTCCC 240

AACCTCCCTC CGTCTCAPAA AAAA?AAAAA AAAAAAAAAA TT 282



(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2585 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

TAAGGGGGTG TGTGCTCACC TCCTCCTGAC _ # CCTTAACACT CCTGTCCTGC CCAGACCAAC 50 

AGAGAGAGCT GTCCCTGAGA . CCCCGGAGAG _ AAGCAGCTGC_.CGAAAGCXGC AGCGTTTCCG 120 

30 CACTCTGAGA CCATGATCTT CCTCCTGCCA GGGGAGAGCC ACCCACAGGC CATGTCCAGC 130 

CCCACTTCCC- TCAGCCCCCA GGGXTTCCTT CTGGCCCCTC TGAGGATTCC CTAGGGCTGC 240 

CCCGCAGAGG GGTTTCCCCA AGCTGTGTTT TGAAGCCTGC AATGTCGAAA AGTGAGAAGT 300 

35 

CAGAGGGAAC AGGACAGGTG CAGCCGGGCT CTGAGGCCAC ACCTCACACC TCGCTGTTCC 360 
CCAACA.TCCC CTGAGCAGTG TGAGCTCATC TCACCAGATG AGAAGAGGCC CTGTGCATTT . 420 

40 YTTTTGTTTG TTTGTTGCTG TTTTCCCCCA CCCATCCAGT TCTCCTCAGC AAA.GCAAATT 430 

CCTTAACACC TTTGGTGGAG AATTTCTTAC CCAGACTTGG GGCTGTGATG CCGTTCAGTG 540 

CGTGGTGAGT GCAGCGTGTG TGCGTGTGCC TGTGTGTGAA CCTGGGGGCC ATCCTGGTGG .500 

45 

CCTGGGAGCG TGAGGAGAGG CCCCCTGTGT GCTGGGTGAG TGGTGGGTGT GGGGTCAATG 550 

CAGTGAGGCT CTCTGGGTGA GGGTCGCAAC CTGGCAGTCC CCAGCCTCCC A3CATCTGTG 720 

50 AGCGTCTGTT GGACTTTACA GAAGAGCGTC ATCCYGTCTG CCCCTCACTC TGCCCTGGAA 730 

TCAACATCTT CCGAGTCCTT CTTGGGGGAA ATAGCAGAGG CCCACTTAAC TGCATAAACT 340 

GCTTCCCATT CCGCAGCCCA GTTCTGATTG TTGAGGTGTC GCGTCGTTCC AGGTCCCCCA 300 

GTCCCCTCTT TCTCCTGTCC TCTGTCTGTC CTTCACCTCC CCACTCCAGC CGGGGCTCAG $60 

TTCAGGGAAA TGCTGTTCCA YATCAGCCCT CTGCTCTCTG AGGCAGGCGC GCCTCTGACT 1020 

CGGAGCTACT TGAAACTTGT GCTCTTGCTA GGATTGGAGT CTACCTATCT CTTC-CATTTG 1080 
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TCCCAGCTGG AGTTCTGGAA CTTTCCTCCT 
TGGGGGGCCT GGGGAAGGAA GGAGTTCAGA 

TGCTCTCTCT 



CGGGGTGGGG GTGGGGG7TO 



CTTGArGTCA 



TGTGTCCTTG TAAATATGTT TTAGGAAGAA 
GATTGCAGGG GTCCftGCCTT GCCTGTTTCC 
AGACTGGTCC CCTCAAAAGG TAGACAAAAC 
TCAAAGTGGC TTTTTGTTAG ACAAGGTTAA 
TCCTTCCTCA GCTCCTTGAT TTGTGACCTT 
GTGCCCTCTC CTCGATGCCT CCCTCCTTCC 
GGGAATTAGG GCCATGCTGG AAGAAGCTTA 
GCTTGGTCCT GGAACTCCCC TTGGCTGCCC 
AGGTGGATGT CAGATCTGGT ACGTTGCAGC 
TCAGAGAGSG TCCAAGGGTG ATGGAGAAGG 
GTGGTGGGTG GCGGCATCTT GACTGCCCCC 
CYCTTCACTC CAGCCCGCCT GCCTTCAGCC 




AGCAGCTCCC TGTGGAG2TG 
GGTTTC C7CA TGAGCAAT-2T 
GACCAAGGGG CCTGGCAIGC 



TG2AGATCGG 
AG 



rrccA 



CTTTGGAGGG GGTGGGGTCC GTTGGCATCA. 
GAGCCCTCAG CCCCTGGGGA GAACAAA7GG 
GCTGCGGGCT GGCGGCAGTC CCAGGGGAGA 
AGGAAGTTCC CAGCAGAGCA AACTGCTTTC 
TGCAATAACT GAGCTTAGAG TTAGGAATTG 
TTTAACTGCT GAAATTGTAT CTCTCAGTAA 
AAAGTGTTAG ACTGTGTGCG TGTGGGTTGA 
GCCCTGCCTT TCCCCTGCGC CCCCATCCTC 
CCTGCCTCGT GTCGTCTTTA TCTGCCTATT 
ACATGCATAA AGGAAATCAA ATGTTATTTT 
CACCAAAAAA AAAAAAAAAA. ACC CNGGGGG 



(2) 




ACTCAGCCTA. AGGAAACAAG 
TAAGAAAATG GAAAATAAAA 
GGGGCCGGTA ACCCAT7TCG 



ACTTTATAAr- 
CC2AA 



1140
1200
1260
1320
1380
1440
1500
1560
1620
1680
1740
1800
1860
1920
1980
2040
2100
2160
2220
2280
2340
2400
2460
2520
2580
2585



(2) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 994 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:
GAATTCGGCA GAGGTTCGGC GAAGATAGGG AATAAGGAA.G CACAGGAGTA GGGGAGAAGG 
AAGCACAGGA GTAGGGGAGA TATACAGCOG TCAGGATAAG C<SGGAAAGGG CGGTGGTTGC 
SCAAGAGGTG AAACnAGATG TGAGAGACAA GGGGTAGGGA AGAAATGGGG CAGCGGTTAG 
GTTCAGAAGC GCATAGACCG TGGCGGACGG GGAATGCGAG GGGCACAGAA AGGAACTGAG 
GGGTGGGCTA TTTTAARGGA GATGGTCCTT CAGCCCTCTT • YTTTTCTGCG TAGTTCTCCT 
CCTCCAGGCC GCOCGCGG^T ATGTCGTCCG GAAA.CCAGCC CAGTCTAGGC TGGATGATGA 
CCCACCTCCT TCTACGCTGC TCAAAGACTA CCAGAATGTC cctggaattg AGAAGGTTGA 
. TGATGTCGTG AAAAG ACT C T TGTCTTTGGA AATGGCC-A.C AAGAAGGAGA TGCTAAAAA.T 
CL^JGCAAGAA CAGTTTA.TGA AGAAGATTGT TGCAAACCCA GAGGACA.CCA C-ATCCCTGGA 
GGCTCGAATT ATTGCCTTGT CTGTCAAGAT CCGCAGTTAT GAAGAACACT TGGAGAAACA 
TCGAAAGGAC AAAGCC CAC A AACGCTATCT' ' GCTAATGAGC ATTGAC CAGA GGAAAAAGAT 
GCTCAAAAAC CTCCGTAACA CCAACTATGA TGTCTTTGAG AAGATATGCT GGGGGCTOGG 
AATTGAGTAC . ACCTTCCCCC CTCTGTATTA CCGAAGAGCC CACGGCCGAT TCGTGACCAA 
GAAGGCTCTG TGCATTCGGG TTTTCCAGGA GACTCAAAAG CTGAA.GAA.GC GAAGAAGAGC 
CTTAAAGGCT GCAGCAGCAG CCCAAAAAOA. AGCAAAGCGG AGGAACCCAG ACAGCCCTGC 
CAAAGCCA.TA CCAAAGACAC TCAAAGACAG CCAATAAATT CTGTTCAATC ATTTJ^AAAAA 
AAAAAAAAAA AAAAAAAAAA AAAAAGGGGA GGGG 

(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1542 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:
GGCASAGG-CA. CCTCGGCCCC GGGCTCCGAA GCGGCTCGGG GGCCCCCTTT C GGTCAAC AT 
CGTAGTCCAC CCCCTCCCCA TCCCCAGCCC CCGGGGATTC AGGCTCGCCA GCGCCCAGCC 
AGGGAGCCGG CCGGGAAGCG CGATGGGGGC CCC2CCCGCC TCGCTCCTGC TCCTGCTCCT 
GCTGTTCGCC TGCTGCTGGG CGCCCGGCGG GGGCAACCTC TCCCAGGACG ACAGCCAGCC 




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CTGGACATCT GATGAAACAG TGGTGGCTGG TGGCACCGTG GTGCTCAAGT GCCAAGTGAA 300 

AGATCA.CGAG GACTCATCCC TGCAATGGTC TTAACCCTGC TCAGCAGACT CTCTACTTTG 260 

5 

GGGAGAAGAG AGCCCTTCC-A GATAATCGAA TTCAGCTGGT TAMCTCTA.CG CCCCACGAGC 420 

TCAGCATCAG CATCAGCAAT GTGGCCCTGG CAGACGAGGG CGAGTACACC TGCTGAATCT 4S0 

10 TCACTATGCC TG7GCGAACT GCCAAGTCCC TCGTCACTGT GCTAGGAATT CCACAGAAGC 540 

CCATGATCAC TGGTTATAAA" TCTTCATTAC GGGAAAAAGA CACAiGCCACC CTAAACTGTC 500 

AGTCTTCTGG ' GAGCAAGC-GT GCAGCCCGGC TCACCTGGAG AAAGGGTGAC CAAGAACTCC 660 

15 

ACGGAGAACC AACCCGCATA CAjGGAAGATC CCAATGGTAA. AACCTTCACT GTCAGCAGCT 72Q 

CGGTGACATT CCAGGTTACC CGGGAGGATG ATGGGGCGAG CATCGTGTGC TCTGTGAACC 73 G 

20 ATGAATCTCT AAAGGGAGCT GACAGATCCA CCTCTCAACG CATTGAAGTT TTATACACAC 840 

CAACTGCGAT GATTAGGCCA GACCCTCCCC ATCCTCGTGA . GGGCC AGAAG . CTGTTGCTAC ■ • 900 

ACTGTGAGGG TCGCGGCAAT CCAGTCCCCC AGCAGTACCT ATGGGAGAAG GAGGGCAGTG 960 

25 

TGCCACCCCT GAAGATGACC CAGGAGAGTG CCCTGATCTT CCCTTTCCTC AA.CAAGAGTG 1020 

ACAGTGGCAC CTACGGCTGC ACAGCCACCA GCAAGATGGG CAGCTACAAG GCCTACTACA 10 SO 

30 CCCTCAATGT TAATGACCCC AGTCCGGTGC CCTCCTCCTC CAGCACCTAC CACGCCATCA 1140 

TCGGTGGGAT CGTGGCTTTC ATTGTCTTCC TGCTGCTCAT CATGCTCATC TTCCTTGGCC 1200 

AGTACTTGAT CCGGCACAAA GGAACCTACC TGACACATGA GGCAAAAGGC TCCGACGATG 1260 

35 

CTCCAGACGC GGA-CACGGCC ATCATCAATG O.GAAGGGGG GCAGTCAGGA GGGGACGACA 1320 

AGAAGGAATA. TTTCATCTAG AGGCGCCTGC CCACTTCCTG CGCCCCCCAG GGCCCTGTGG 1330 

40 GGACTTGCTG GGGCCGTCAC CAACCCGGAC TTGTACAGAG CAACCGCAGG GGCCGSCCCT 1440 

CCCGMTTGTT CCCCAGCCCA CCCACCCCCT TGTTA.CAGAA TGTTiTKGTTT GGGGTGCGGT 15QQ 

TTTGTWATTC GTTTNGGATN GGGGAAGGGA GGGANGGCGG GG 1542 

(2) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1390 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

CAAGCTCTAA TACGACTCAC TA.TAGGGAAA GCTGGTACGC CTGCA.GGTAC CGGTCCGGAA 60 

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TTCCCGGGTC GACCCACGCG 
CTCGTCATGG TGGCGCCTGT 
CTCTTCCTGA CTCGCAGCCG 
GAGGAGCTGG CAC^AGCAGG 
AGAGCTGGAG GCAGGCCTCG 
CGAGCCCAGC GGGTGGCCTG 
GCCCAGGAGG AGGAAGGTGT 
GCTAAGAAAC TGCGGAANNT 
GAGGCTGAAC GTGARGWGCG 
GGAGGAGGGG CTTCGCCTGG 
GGAGCAGGCC CAGCGGGAGC 
GGAGGAAGGC GTAGGAGAGA 
CATCAACTAC ATCAAGCAGT 
CCTACGCACT C AGG AC AC C A 
AGGTGTGATT GACGACCGGG 
GGCCAACTTC ATCCGACAGC 
CTCCCTCATC GCCTGGGGCC 
CCCTCTTGGA CTCAGAGTTG 
ATCCTGGGGA AGTGATGGTG 
GAGCTTGGTG TGGCTTGGTG 
AATTTAGGCT TCAGnATATA 
CTGTTCTT AT TATGAATCCA 
AAAAACTCGA 



:tc agggtggacg 



:C- C-.CTGAGGCC 



GTGGTACTTG GTAGCGGCGG CTCTGCTAGT 
GGGCCGGGCG GCATCAGCCG GCCAAGAGCC AGTGCACAAT 



STGGCC CAGCCTGGGC CCCTGGAGCC TGAC-GAGCCG 

:ggagg GACCTGGGCA ggcgcctaca g 



IAGCGT 



GGCAGAAGCA GATGAGAACG AGGAGGAAGC TGTCATCCTA 
CGAGAAGCCA GCGGAAAYTC ACCTGTCC-GG gaaaattgga 
GGAGGAGAAA CAAGCGCGAA AGGCCCAGCX TGAGOGAGAG 
GAAACGACTC GAGTCCCAGC GCGAATGAGT GGAAGAAGGA 
AGGAGGAGCA GAAGGAGGAG GAGGAGAGGA AGGCCCGCGA 
ATGAGGAGTA CCTGAAACTG AAGGAGGCCT TTGTGGTGGA 
CCATGACTGA GGAACAGTCC C AG AGCTTCC • TGACAGAGTT 
CCAAGGTTGT GCTCTTGGAA GACCTGGCTT CCGAGGTGGG 
TAAATCGCAT ' CCAGGACCTG CTGGCTGAGG GGP.CTATAAC 
GC-A.GTTCAT CTACATAACC CCAGAGGAAC TGGCCGCGGT 



:-T GTCCATCGCC GAGCTTGCCC AAGCCAGCAA 



GGGAGTCCCC TGCCCAAGCC CCAGCCTGAC CCCAGTCCTT 
GTGTGGCCTA CCTGGCTATA CATCTTCATC CCTCCCCACC 
TGGQCAGGCA GTTATAGATT AAAGGCCTGT GAGTACTGCT 
TGGCAGAAGG CCTGGCCTAG GATCCTAGAT AAGCAGGTGA 
TCCGAGAGGT GGGGAGGGTC CCTTGGAAGC TGGTGAAGTC 
TTCATTCAAG AAAATAGCCT . GTTGCAAAAA AAnAAAAAAA 



120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260
1320
1380
1390



(2) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1293 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

GGOGCGCGGG TGAAAGGCGC ATTGATGCAG CCTGCGCCGG CCTCGGAGCG 



:ggasca 



60



GACGCTGACC ACGTTC CTCT CCTCGGTCTC CTO 
CCGGGAGCCA TGCGACCCCA GGGGCCO: 



rrcc 



TC 



3C TGCCCGGCAG 
GCCTCCCCCC AGGGGCTCGG CGGCCTCCTG 



CTGCTCCTGC TGCTGCAGCT 



TCGAGC 



rr CTGAGATCCC CAAGGGGAAG 



CAAAAGGCGC ATCCGGCAGA GGGAGGTGGT GGACCTGTAT AATGGAATGT GCTTA.CAAGG 
GOCAGCAGGA GTGCCTGGTC GAGACGGGAG CCGTGGGCOC AATGGCATTC CC-GGTACACC 
TGGGATCCCA GGTCGGGATG GATTCAAAGG AjGAAAAGGGG GAATGTCTGA GGGAAAGGTT 
TGA.GGAGTCC TGGACACCCA. ACTAGAAGGA GTGTTCATGG - AGTTC ATTGA ATTATGGCAT 
AGA.TCTTGGG AAAATTGCGG AGTGTACATT TACAAAGATG CGTTCAAATA GTGCTCTAAG 
AGTTTTGTTC AGTGGCTCAC TTCGGCTAAA ATGCAGAAAT GGATGCTGTC AGCGTTGGTA 
TTTCACATTC AATGGAGCTG AATGTTCAGG ACCTCTTCCC ATTGAAGCTA TAATTTATTT 
GGACCAAGGA AGCCCTGAAA TGAATTCAAC AATTAA.TA.TT CATCGCACTT GTTCTGTGGA 
AGGACTTTGT GAAGGAATTG GTGCTGGATT AGTGGATGTT GCTATCTGGG TTGGCACTTG 
TTCAGATTAC CCAAAAGGAG ATGCTTCTAC ' TGGATGGAAT TCAGTTTCTC GGATCATTAT 
TGAAGAACTA CCAAAATAAA TGCTTTAATT TTCATTTGCT ACGTCTTTTT TTATTATGCC 
TTGGAATGGT TCACTTAAA.T GACATXTTAA A.TAAGTTTAT GTA.TACATCT GAATGAAAAG 
CAAA.GCTAAA TATGTTTACA GACCAAAGTG TGA.TTTCACA TGTTTTTAAA TCTAGCATTA 
TTCATTTTGC TTCAATCAAA AGTGGTTTCA ATATTTTTTT TAGTTGGTTA GAATACTTTG 
TTCATAGTCA CATTCTCTCA ACCTATAATT TGGGAATATT GTTGTGGTCT TTTGTTTTTT 
CTCTTAGTAT AGCA.TTTTTA AAAAAATATA AAA.GGTACGA ATCTTTGTAG AATTTGTAAA 
TGTTAAGAAT TTTTTTTATA TGTGTTAAAT AAAAATTATT TCCMACAACC TTAAAAAAAA 
AAAAAAAAAA AAAAAAAAAA AAAAANAA. 

(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1517 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:



AGTGGCTTAA AGGCATCGTT TTAGGGATTA. CTGGGAAGTA TCTTCAAAGT AATACA.TGAG 
AAACATTCCT TCCTAAATCC TTTATTATA.T TGAATATCGT ATTAA.TTGGT TTTCAGAGGT 



120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260
1293



60
120 
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TAAATTAACG ATGTATTGGT GCAATAAATG TCACTrGTi4T CTTGTATATA ATCTTTTTTA 
TAT ATT AC CG GATTGATTCA TTAGTATTTT GTTGAGGATT TTTGTGTCTA TA.TTCA.TAAG 
AGATGCTGGT GTGCAGTTTT CTTTTTTTGT C-ATAATCTGG TTTTTGTATC PsGTAATACAG 
GCCCCATGAA. ACGAGTTGGG AAGTGTTCAC CTCTCTTGTA TTTTTTCAAG AGTTTGTGAA 
GAATTGCTAT TAATTCTTTA AA7GTTTGGT AGA.ATCTACC ATTG.-AATGA TGTGTCGTGG 
GCTTTTTTTT GAGGGAAGTG TTCTGATAAG TAATTGAGTA TCTACTTTTT ATAGCTCTGT 
TCAGATTTTG CTTCTTCCTG AGTTAGTTTT GGTAATTTGT GTATCTGTAG GAP/TTTGTCC 
ATTTGATTTA TGTGATTTGT TGGCATAAA.T TAAACTAAAT TTGGCCTGAG CCTACCTGTA 
TATCTTGAGT CCGTGTGTAA GGAACTGTAG CCTAACTTGT ACATAAACAA ACTGAAATGG 
TAAATTAGGA ATGTAGTTTT TGTAACAGGT CCTGAGTGTG AGGCAGTCAC AGCAG*/CAAG 
TCTGTGAATT GCAGGGTGGT AACTAAGCAG CCCATGSTCA AATGAGGGAA. AAACCTTTGG 
TTFTAACACA TAGTATAGCT TTGTAATCCT TTTCTTGCAC ACTCGGGTAA TTTCTTCCTT 
tttcattccc KGWATTTTCC AXGAATATGA RTCTYCCTTT TTTCCCCTCC TGTCAGTCTA 
GCTAATGGTT TGTGAATTTT GTTGATCTTT 4 , - TGAAKAACAA ACCTTTGGTT CCACTTTCTT 
GTTGCATATG CTCAFTATTC TCATAATTGG AGTGGAAA.GC TGATCTTTGA TTACTTA.TTT 
TACTTAGGGG TGAGGAGTTC ATGGACTTCG CAAAACCTCC TTGAATCTAA ATTGCATCTT 
CTTTCCTGGT TTCTGGGCTG AAAGATGTTT TTTCCCA.TCT WANAWACCCT TGGTCTTTTG 
ATKGGCGATT AAGACTAGAG AAAGTTCTAG ATHCGTTGTG CTTTTATGCT GTCATTTTGT 
TTAAAGGCTT - TCTATGTAGT AAAACTATCT ATATAGACAA AATAGAGGGT TGAGTTGTGG 
TCTTGAATTT GATGAACATG ATTTACCACA TTCTGTACTG GATATTTCTT CACCTGCTGC 
TACTGTAAAC CATTTTATTG TTGGATCTTG TGT AG AGT AT ATTATCACAG GTACTTTTTA 
C^GGGGTGTC TAATCTTTTG GCTTCCCTGG GGACATTGAA AGAA.GAAGAA TTGTCTTGGG 
CCACA.GATCA AATACGGTAA CACTAATAAT AGTTGATGAG CTAAAAAAAA AAAAAAAAAG 
GCAAAAAAGN CCGAAAA . 

(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1073 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:



TGAAXCTATT CTTTGAACAT 

TTCTGCAGTG tgaaatagat 

CA.CAGGCCTT TTTGCAAATG 






AA.TA.TATGTC TCCATCTGGT 
A.-.CZTGGTCC AATAGTCCAA 
AGGAAGTCCA AAAAGCAGAA 
CTGCAGAGCT GTATCTTCAG 
GGTATGGWTC TCCTTACCCT 
AAGTCAAACG 

AGGATGTAGA CCAGTGCTGT 
TCAA.TAAGCA GCCTACTGAA 
CCACACAATT GACAAATGAT 
CTTTCTGTAG GAGAATTGAA 
AGAGTTATGT GTTAGTCTCA 
ATTAGATATT GGTGTCAGAA 
ATAATGTATC TTATGTATGT 
AATTATTTAA TCTGATATGT 
• CATGCATTT A AAAATAAAGC 



TCTAjCAACAA GAATTAjCATT ATACTGTTAT accagagtac so 

TGGTTTGGAA AATGAACCTG GCTTTGCTA.T AAATTA.CATT . 120 

TGTAACTTGC CTATCAAAGT AGTTTGTAGG GCAAATGCAG 130 

AAAGTA.CCTT WTAYTCATGT GGGAAATCAA GTAGTATCAG 240 

TTTGTTAAAG CCAAC-C-GCCA TTCTGTTAGT GATGGGCTGG 300 

ATGAAA.GCTT ACATGGAATT AGTCAACAA.T ATGCTGTTGA- 360 

TGGTGTGATG AAGCTACAGT AGGGRMGATC ACTCATGMTA 420 

TGGGGTCTGW WTCATATTTT GGCCTA.TCAA AAACAGTGGG 430 

GCTATTGGAT GGGGAAAGAA GACTCTGGAC GAGGTCTTAG 540 

CAAjGCTCTCT CTCAAAGACT GGGAACAjCAA CCGTATTTCT . 500 

CTTGA.CGCA.C TGGTATTTGG CCATCTATAC AC Z ATTCTT A 560 

GAACTTTCTG AGAAGGTGAA AAACTATAGC AACCTCCTTG 720 

CAGCACTATT JTTGAAGATCG TGGTAAAGGC AGGCTGTCAT 730 

GGAGTCTTAA CTTTTGAAAT ATGTTTTACT TGAATGTTAC 34Q 

TTTTAAAACC AAATTACTGC TTTTTGAAAC CTCAAATTAT 500 

GCTTTATATT GTTATTTGTG TATACATTAA AATAATTCTG 960 

TGTATTCTGT ATCTTGAAAT TTTTGTTTCC TTGAAACATG 1020 

TTAAACAACT GTAAAAAAAA AAAAAAAAAA CTC 1073 



(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 300 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

CAACCCCTGC CTTTTTTTTG TTTTCCATTT GCTTGGTAGA TCTTCCTCCA TCCCTTTATT 60
TTGAGCCTAT GTGTGTCTCT GCCCGTGAGA TGAGTCTCCT GAATACAGCA CACTTACTGG 120
TCTTGACTCT GTATCCAATT TGCCAGTCTG TGTCTTTCAT TTGGAGCATT TAGCCCATTT 180
ACATTTAAGG TKAATATTGT TATGTGTGAA TTTFA.TCYTR TCATTATGWT GTTAGCTGGT 240
TATTTTGCTT GTTAGTTGAT GCAGTTTCTT CCNGGCATCA ATGGTCTTTA CAA&TFTGGCA 300



(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1275 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:



GGCAGAGCCT GTCCCTGCTG CCCCTGCAAA AAAAACCCCC TCTGGTGTGA GCAGGATGGT 60

TGGAGGTTAT GTGAGCTCCT TCTCCTTTCC TCCAGTTTCC TCTTCCCTTC TCCTCCCTGC 120

CTCTTTTGCT TTTCCCTTTC TTCCTGGTAC CCCCTGCCCA TTCCTGTATT TTCTCCCATC 180

GCCATTCTCC CCTCTCCCAC TGTCCCTAAC CCGTTCAAAC TCTTTCCTCT TAAATGGTTG 240

AGATTTTCTC TCACCAAGCA CACCCCAGTA TTAATTAAAC TAGCTGCAAA CAGGCAGCAA 300

GTGGTCTACC ATGACAGATG GGTTTTGTGT GTGTGTGTGT GTGTGTAATT GTAATAAAAC 360

ATATTGARTC ACTCAATAAA CACAGAGTGT CTACTACATG TATCAHGCAC TATCATAGAT 420

GCTAATTAAC GAAACTGAAA TGGCCAGGCC CTCACAGTGG CTCATGCCTA TAATCCCAGC 480

ACTTTGGGAG GATGAGGCAG GAGGATCACT TGAGGCCGGG AGTTCAAGAC CAGCCTGGGC 540

AACATAGTAA GACTCCATCT CTACAAAAAA AAAATTTTTT TTATTATACT TTAAGTTTTG 600

GGTTACATGT GCAGAACGTG TAGTTTTGTT ACATAGGTAT ATACGTGCCC TGGTAGTTTG 660

CTGCACCCAT CAACCCATCA CCTACATTAG GTATTTCTCC TAATGTTACC CCTCTCCTAG 720

CCCCCCACCC CGTGACAGGC CCTGGTGTGT GATGTTCCGC TCCCTGTGTC CATGTGTTCT 780

CATTGGTCAA CTCTCACCTA TGGAGTGAGA ACATGTGGTA TTTGGTTTTC TGATCTTGTG 840

ATAGCTTGCT GAGAATGTKG GTTTCCAGCT TXATCCACGT CCCTGCAAAG GGCATAAACT 900

CATCCCTTTT TATGGCTGCA TAGTGTTCCA TGGTGTATAC GTGCCACATT TTCTTAATCT 960

ATCATTGATG GACAAGTTTT GCTATTGTGA ATAGTGCCAC AATAAACATA CGTGTGCGTG 1020

TGTCTTTATA GCAGCATGAT TTATAATCCT TTGGGTATAT ACCCAGTAAT GGGATCACTG 1080

AGTCAAATGG TATTTCTCGT TCTAGATCCG TAAGGAATTG CCACACTGTC TTCCACAATG 1140

TTTGAACTAA TOTACACTCC C^CCAACAGT GTAAAAGTGT TTCTATTTTT CCACAACCTC 1200

TCCAACATCT GTTATTTCCT GACTTTTTAA TGAACGTCAT TCTAACTGGC GTGAGATGGT 1260

ATCTCATTGT GGTTT 1275






(2) INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 472 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

CNGAAACCCC GTGAACCCTC CCCGGGTTAA AAAGCCCCCC CTAAATGGGG GGAACGCYTC 60 

ACACGTTATA AAAAAGCACT AGAATGTTTT GAAAGCGAGA AACAACAGCT GTGTAGGGTA 120 

GCTAGCAGTT AGTGTTGTAC AGAAGACAGA TATTTGTGCA TTTYTGCATT TTCTAAGTTT 180

GCTGCAATGA GCATGTATTA CTTTCATAGT TATAAAACAC ATGCAAAATG CCCTTTTAAA 240

ATGAAAAAAA ATCCATGAGT GTAAGTGATA TATATGCTTT GGAAAGCCTG GGACGGTCAT 300

TGTTTACTCT CAATAGTATG TGTTTGCCTT TGTCTTTTTG AGACATTTTG TTTTAATCTG 360

TTGATGACAA TAACCTGTTG ATAATATAAC TTGATAACAA ATAAAATGAC TTATGATTGA 420

AWMAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA NN 472



(2) INFORMATION FOR SEQ ID NO: 131:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1950 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:



ACCTCTCAGA ATCTTCTCTC AGCAACCTGA GTCTTCGCCG TTCCTCAGAG CGCCTCAGTG 60

ACACCCCTGG ATCCTTCCAG TCACCTTCCC TGGAAATTCT GCTGTCCAGC TGCTCCCTGT 120

GCCGTGCCTG TMATTCGCTG GTGTATGATG AGGAAATCAT GGCTGGCTGG GCACCTGATG 180

ACTCTAACCT CAACACAACC TGCCCCTTCT GCGCCTGCCC CTTTNTGCCC GTGCTCAGTG 240

TCCAGACCNT TGATTCCCGG CCCAGTGTCC CCAGCCCCAA ATCTGCTGGT GCCAGTGGCA 300

GCAAAGATGC TCCTGTCCCT GGTGGTCCTG GCCCTGTGCT CAGTGACCGA AGCTCTGCCT 360

TGCTCTGGAT GAGCCCCAGC TCTGCAACGG GCACATGGGG GGAGCCTCCC GGCGGGTTGA 420

GAGTGGGGCA TGGGCATACC TGAGCCCCCT GGTGCTGCGT AAGGAGCTGG AGTCGCTGGT 480

AGAGAACGAG GGCAGTGAGG TGCTGGCGTT GCCTGAACTG CCCTCTGCCC ACCCCATCAT 540

CTTCTGGAAC CTTTTGTGGT ATTTCCAACG GCTACGNCTG CCCAGTATTC TACCAGGCCT 600

GGTGCTGGCC TCCTGTGATG GGCCTTCGMA CTCCCAGGCC CCATCTCCTT GGCTAACCCC 660
TGATC TCTGTTCAGG TACGGCTOCT GTGGGATGTA CTGACCCCTG ACCCCAA2AG 720

CTGCCCACCT CTCTATGTGC TCTGGAGGCT CCACAGCCAG ATCCCCCAGC GC-GTGGTATG 780

GCCAGGCCCT GTACCTGCAT CCCTTAGTTT GGGACTGTTG GAGTC-GTGC TGCGCCATGT 840

TGGACTCAAT GAAGTGCACA AGGCTGTGGG GCTCCTGCTG GAAACTCTAG GKCCCCACC 900

CACTGGCCTG CACCTGCAGA GGGGAATCTA CCGTGAGATA TTATTCCTGA CAATGGCTGC 960

TCTGGGCAAG GACCACGTGG ACATAGTGGC CTTCGATAAG AAGTACAAGT CTGCCTTTAA 1020

CAAGCTGGCC AGCAGCATGG GCAAGGAGGA GCTGAGGCAC CGGCGGGCGC AGATGCCCAC 1080

TCCGAAGGCC ATTGACTGCC GAAAATGTTT TGGAGCACCT CCAGAATGCT AGAGACCTTA 1140

AGCTTCCCTC TCCAGCCTAG GGTGGGGAAG TGAGGAAGAA GGGATTCTAG AGTTAAACTG 1200

CTTCCCTGTT GCCTTCATGG AGTTGGGAAC AGGCTGGGAA GGATGCCCAG TCAAAGGGTC 1260

CAAGCGAGGA CAACAGGAAG AGGGATCCAC TGTTACCAAA AGTCCTGATT CCCCCATCAC 1320

CAACCTACCC AGTTTGTTCG TGCTGATGTT GGGGGAGATC TGGGGGGAGT TGGTACAGCT 1380

CTGTTCTTCC CTTGTCCTAT ACCGGGAAGT CCGCTCCAGG GTACCCACAG ATCTGCATTG 1440

CCCTGGTCAT TTTAGAAGTT TTTGTTTTAA AAAACAACTG GAAAGATGCA GAGCTACTGA 1500

GCCXTTGCCC TGAATGGGAG GTAGGGATGT CATTCTCCAC CAATAATGGT CCCTCTTCCC 1560

TGACGTTGCT GAAGGAGCCC AAGGCTCTCC ATGCCTTTCT ACCTAAGTGT TTGTATTTTA 1620

TTTTAAATTA TTTATTCTGG AGCCACAGCC CCCTTGCTTA TGAGGTTCTT ATGGAGAGTG 1680

AGAAAGGGAA GGGAAATAGG GCACCATGGT CCGGTGGTTT GTAGTTCCTT CAAAGTCAGG 1740

CACTGGGAGC TAGAGGAGTC TCAAGGTCCC GTTAGGAAGA ACTGGTGCCC CCTCCAGTCC 1800

TAATTTTTCT TGCCTGCCCC GCCTTGGGGA ATGCCTCAGC CACCCAGGTC CTGACCTGTG 1860

CAATAAGGAT TGTTCCCTGC CAAGTTTTGT TGGATGTAAA TATAGTAAAA GCTGCTTCTG 1920

TCTTTTTCAA AAS^AAAAAAA AAAAAAAACT 1950
(2) INFORMATION FOR SEQ ID NO: 132:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 990 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:

TGGAAGATTT A-AATAGGTT TCATATTTCT CTTGAATATG AATATATA-.G CTTGAATAAG 60

crra-GCccr tattattatg AA-jrrTrccT tattatttct acc-a.tgctt cttatattaa 
,-jGcrrGArc? rrrrcATA— agtatatgta cattagctgc ctgtggatta acaittccat 

X-AA.TGTA77 7TTGCATTGT TTaATCTTAA ACTTTTTGTG TCTTTATATA AGGTATGCTY 

crrrrAAc-cA TGArArrcrr .-aj:cacaata ctttgaaagac aatctycacc ttttacttgt 

ATA nTAJZAT GT^rGTAAT TTTTGATGCA TATTACGTCT TATTATTTAA CCAACCTATT 
7TA7TTCA.TC TAC-3C-CArTT TTC^GAmGC CTTATTTTCT TGTATTAATC AAATATTTT7 
A^CATTGTAT TT7CCY CT AT TA7TTAGKAA TACGKXACYC YAAATATATA TTOTGG STAT 
TTTCAGAA.TT C-CAATATGCC TTTTTAATTT ATTAGAGGCT AAC CT AAATT ATTACTTTTA 
C-TACTTACTT a-AAATTTTG GAACTTTAGA ACATTTATTG TTTTATGCAT TTTAATTCTA 
CTTT-TATTTT TACTACTCCT AAACA7TATT ATTGTTTTAG ACAAGCCAAA ATATATOTTG 
TTATTA.TTTT AT/CTCCATT TCTTTCTGTA TTTTTATGCC ACTATGTATG CTCAATTTGC 
TTCTATGCGA TaA^GCTAAT TC-GTACTTT TGTTTTTTAA TCTGTGCAGG' TAGCCTGGCC 
ATTAAATTTT TATTTTTGGT TTGCTGAAAA AATTGTGTTT ATTTCTATAT GCATACTTAT 
GCATA.TAJGAA T^ICTAGGCKG • ACATATTTTT AGTATTTATA AATGTAAAGT CATTOATTKG 
, GCr?C7A.TCA rrrr-:GTr CGA GA.-A.TC.iA.TT GTCAGCCCAA TAGTTTTTC-. TTTTAAATTA 
CTG-A~TTT TCATGTCT GT G3TTTTAGGA 

(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1720 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

GTCTGACAAG CGACTGTGGT TAXTCCCCTA AAGTTTACTT CAGCACTAAC ACTAGTGCTT 

COGGTGGAGT TTGCAG7TTT CC-GCTTTAT ACAGGATTTT CCTTTGACTG GAAGAGTCAA 

GGATA.TAGAG ACTCAAZAGT GAZATTTATT GTACAACATC AftGGGGAATA GGATACTCAT 

OAACTGGGA TTATTCTTA.T C^eSAACATCG TCTTCTTTGA ATAAGAAAAA TACATAGTTG 

GTTATTAJTGG ACTTAAAACT GTGTTAAATG GATATTCTGA TAAAATATTT GCTGCTCTGT 

AGAGTGTGGA AAATCTGAGA ATATTAGCTT TACTCATCTT GAGCTTTGAG GATGTTCTCT 

GTACGCOGAT GGTTTCATAT TAACTAAAAA AGCTGGGTAT TGTAAAATCT CATTTATAAA 

AACTCAGATG AGAAGAAAAT TTTCTTTGAT GGTGAGACTG TTGTCTTAGT TCAGGAAATT 
ATTTAATAAT CCTTTGTTAC CTGTGAATGA AGGAACTTTG TAATTCTGAT TTATCGTAAA 540

ACATGAGCCT TTCCAGAGTC ACCTTAGACA CTGTTGTCGC AAATAGCCAT GCTTTGCCTT 600

ATGCCAAGGA GGCCCAGAGG GAGGGCCTAG TCTTCCTCTG TTGCTGTACA TATATTGAAA 660

TGCTTTTTTT TTTTATTTTG CATTTGTTAT CTATAATGAG CTTTGTGAGC GCTGATATTA 720

TGTGAGACAA ACAGGAGTTA TTGATGTTAT ACACTGCGTT CCATTCAGGA TTTTCTGGTT 780

GGAGGGAAAT ATGTTGACCT TAGAGAATTG TGAATATTGT TGCAATTCTT GAATATATTA 840

CCATGTGAAT AATAGAGACT GTGTTGCTCT CTAGTATAAG GTATATTTAT TTTTGATTGA 900

TTTGAATTAC TAGTTATAAC TGGAGAAATT TTGTTACCTC TATCCTGGCT TGGCTGACTG 960

GCTGTATAAT AGCAGCAGCC TGTTTTAGAG CATCTTAATG AAAACATGGA TGAAAGGAAT 1020

TAATGATGAT ATGTGCAGAC TGCGTAGAAA ATGGCTTTTG TTCGCAGCGT TAACATTTTC 1080

TTCTCAATCA GATTTCAATG TTTGTGGAGA GTGGCAGATT CACACCAGAA ACACTAGGTG 1140

TTCATATCGA TAGCATGC-AT GCAGAATAAG CAGTTGGGAG AGAAGCTTGT TCGTACCTGG 1200

TACTCCTCCC ATTCACCTGA GCCCAGCCCC AGACAGGGGT TAGCATTCAG TGTGGGCCCT 1260

CAGGCAGCCC TGAAGCGTGG CTGGGTCATG AGATGGGGGC AGCGTGTGAC GGGCACGAGC 1320

GGCCTGATTC CAGGGAAGAG TTCGTGGAGG GTGTTGGCTG TTTTTGTTAG CTCAGTTTTT 1380

TTCTGGGCTC CACCATTCCT AACTGCAGGT AGACAAGATA GATGTCACAC ACAACAATTT 1440

TAAAGTATTT TGGTTAGTGC ATTTTGTTTA TGATTGCAGT GTTTGTTTCT TATTTAATAG 1500

GCTTTTTACT TCATTCTATT AAATTTTAGT GTTTAGAAGA GGCGGGTACT GTCACTGTGT 1560

AAAATATGTA ATATTTTATA TGTTATACCA TGTCATATAT AGTTGCAATA TCAGACCTTG 1620

CATTCAATAT ACAATGCAAT TGACTCTTTG CAGACCTGCA TTTTTCAGTG AACAATAAAA 1680

AGATTGTCTG GCACTCCAAA AAAAAAAAAA AAAAAAAAAA 1720
(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 705 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:
GGCACGAGGC CATCTGGGCT CATTCAGCAG GAAATAATGG AAAAAGCTGC AATATCCAGG 
TGTTTACTAC AATCTGGAGG CAAGATCTTT CCTCAGTATG TGCTGATGTT TGGGTTGCTT 
GTOGAATCAC AGACACTCCT AjGAGGAGAAT GCTGTTCAAG GAAOGAACG TACTCTTGGA
TTAAATATAG CACCTTTTAT TAACCAGTTT CAGGTACCTA. TACGTGTA.TT TTTGGACCTA 
TCCTCATTGC CCTGTATACC TTTAAGCAAG CCAGTGGAAC TCTTAAGACT AGATTTAATG 
ACTCCGTATT TGAACACCTC TAACAGAGAA CTAAAGGTAT AGGTTTGTNA AATCTGGGAA 
GACTTGACTG CTAJTTCCATT TTGGGTATCA TATGTACCTT GATGAAGA&3G ATTAGGTTGG 
GATACTTCAA GTGAAGCCTC CCACTGGAAA CAAGCTGCAG TTGTTTTAGA TAATCCCATC 
CAGGTTGAAA TGGGAGAGGA ACTTGTACTC AGCATTCAGC ATCACAAAAG CAA.TGTCAGC 
ATCACAGTAA AGCAATGAAG AGCAGTTTTC CAATGAAAAC TSTGTAAATA GAGCATCAAC 
AAGTACAAAA TTCTTGTCTT AATTAGTGGG GGTATATAAA AATTCGTTGT AATGGTCAAA 
TATTTTTTAA AATTGACATT AATAAAGCAT ATTTTAAAA.G TTTCT 

(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 323 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

AGCACACACC TCCTTTAGTT GCTCCTAAGG TCATGTTCAA CATTCGTGGA GTGCATTTTC

TGCTCAGGGA GCTXTCCCAG ACCCGGAATG TTTGGTGCTC ACAGACYCTG GCAAGGATCG 

GTATTGCTGT TCCTCAGTTT TGCCTGGGGA AATGGAGGST CAGTGACGTT CAGTGACGTG 

CCCAGAGTCA TGCCATTGGC GGGTGGCCCA GKGMTCCAGG TCTCCAGCAC CCCTCGGCCC 

CCTCCTCACC AGGTCACATC ATCTCCTGGA TTAGAATCTG CTCACATAGT CTGTCCTGAA
AGGAAAAAAA AAAAAAAAAA AAC 

(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 582 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

GGACGGAATG GTGCAACCCT CCTWA^riTTT CTKGKGCTGT TOACAACAGA GGGAGGGAGG 60

CAAAACATTT TTYGTGGGAG AATCCTACYT CTGCAGSGGA GCCCTTAAGC GATKGATTTT 120

GAATCTKGAC OCTTTACCAA CTAATTTTGA AGGAAGATAC CTTGGAAATA TTTGGCATTC 180

AGTGGGTTAC TGAAACAGCA TTAGTGAATT CATCTAGAGA ACTCTTTCAT TTATTCAGGC 240

AACAACTGTA CAACTTGGAA ACCTTGTTAC AGTCCAGTTG TGATTTTGGG AARGTATCAA 300

CTCTACACTG CAAAGCAGAC AATATTAGGC AGCAGTGTGT ACTATTTCTC CATTATGTTA 360

AAGTTTTCAT CTTCAGGTAT CTGAAAGTAC AGAATGCTGA GAGTCATGTT CCTGTCCfcTC 420

CTTATGAGGC TTTGGAGGCT CAGCTTCCCT CAGTGTTGAT TGATGAGCTT CATGGATTAC 480

TCTTGTATAT TGGACACCTA TCTGAACTTG CCAGTGTTAA TATAGGAGCA TTTGTAAATC 540

AAAACCAGAT TAAGGTTTGA CTGGTTTCAT TTGATTTTTA AG 582

(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1021 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

TTCGGCAGAG CCCTTGCGCG CTCTTGAATA CCTGCKTTCT GTAGCGCTAG TTCTCTTCAA 60

GATTTGCTTA GTGTCATTTC ATTTCGGTTT CTTTTCTCGC CATGTTTTTC TGTCGGAATT 120

ACGGTTCGTT TTGGTTCTAT GTACTCTCTA AAATGTTATC GTTTTTCATT TGTCTACTAA 180

TTTTCGTGCA TTTGTTACTA CTGAGTTTCT TAATATCTGA CTGGCCTCCG CCCACGGGCT 240

CTGCAGAMCA TAAAATACTC AGGCTGATGG TAGTGCAGAG ACTCTCCCTC CTTGATCAGC 300

GCAAACGTTG GTCEGAGGCT TGAGGGATGG AGCAACATTT TCTTGGCTGT GTGAAGCGGG 360

CTTGGGATTC CGCAGAGGTG GCGCCAGAGC CCCAGCCTCC ACCTATTGTG AGTTCAGAAG 420

ATCGTGGGCC GTGGCCTCTT CCTTTGTATC CAGTACTAGG AGAGTACTCA CTGGACAGCT 480

GTGATTTGGG ACTGCTTTCC AGCCCTTCCT GGCGGCTGCC CGGAGTCTAC TGGCAAAACG 540

GACTCTCTCC TGGAGTCCAG AGCACCTTGG AACCAAGTAC AGCGAAGCCC ACTGAGTTCA 600

GTTGGCCC-GG GACACAGAAG CAGCAAGARG CACCCGTAGA AKAF.GTGGGG CAGGCAGARG 660

AACCCGACAG ACTCAGGCTC CRGCAGCTTC CCTGGAGCAG TCCTCTCCAT OCYTGGGACA 720

GACAGCAGGA CACCGAGGTC TGTGACAGCG GGTGCCTTTT GGAACGCCGC CATCCTCCTG 780

CCCTCCAGCC GTGGCGCCAC CTCCCGGGTT TCTCAGACTG CCTGGAGTGG ATTCTTCGCG 840

TTGGTTTTGC CGCGTTCTCT GTACTCTGGG CGTGCTGTTC ACGGATCTC-T GGAGCTAAC-Z 900

AGCCTTAGAT AGCAGCAGAA GGCTTTTTGG ATTCTCCTCC TTGAAAAGAT TCTCAGTTAC 960

CAAACGTCTC CACCTAGAAA ATAAAAATAC ATTAAGATG? TGANAAAAAA AAARAAAAAA 1020

A 1021

(2) INFORMATION FOR SEQ ID NO: 138:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1777 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

CGGAAGATGA TGGCTTCAAC AGATCCATTC ATGAAGTGAT ACTAAAAAAT ATTACTTGGT 60 

ATTCAGAACG AGTTTTAACT GAAATCTCCT TGGGGAGTCT CCTGATCCTG GTGGTAATAA 120 

25 GAACCATTCA ATAjCAACATG ACTAGGACAC GAGACAAGTA CCTTCACACA AATTGTTTGG iao 

CAGCTTTAGC AAATATGTCG GCACAGTTTC STTCTCTCCA TCAGTATGGT GCCCAGAGGA 240 

TCAT CAGTTT A T T T T CT TTG CTGTCTAAAA AACACAACAA AGTTCTGGAA CAAGCCACAC 3QQ 

30 

AGTCCTTGAG AGGTTCGCTG AGTTCTAATG ATGTTCCTCT ACCAGATTAT GCACAASACC 26 0 

TAAATGTCAT TGAAGAAGTG ATTCGAA.TGA .TGTTAGAGAT CATCAACTCC TGCCTGACAA 420- 

35 ATTCCCTTCA CCACAACCCA AACTTGGTAT ACGCCCTGCT TTACAAACGC GATCTCTTTG 430 

AACAATTTCG AACTCATCCT TCATTTCAGG ATATAATGCA AAATATTGAT CTGGTGATCT 540 

C CTTCTTT AG CTCAAGGTTG CTGCAAGCTG GGAGCTGAGC TGTCAGTGGA ACGGGTCCTG 600 

40 

GAAATCATTA AGCAAGGCGT CGTTGCGCTG CCCAAAGACA GACTGAAGAA ATTTCCAGAA 660 

TTGAAATTCA AATATGTGGA AGAGGAGCAG CCCGAGGAGT TTTTTATCCC CTATGTCTGG 720 

45 TCTCTTGTCT ACAACTCAGC AGTCGGCCTG TACTGGAATC CACAGGACAT CCAGCTGTTC 780 

ACCATGGATT CCGACTGAGG GCAGGATGCT CTCCCACCCG GACCCCTCCA GCCAAGCAGC 840 

CCTTCAAGTT CTTTTATTTC TGGGTAACAG AAGTAGACAG ACAGGTTACT TGGTGTATCT 900 

50 

TCTGTTAAAG AGGATTGCAC GAGTGTGTTT TCCTCACACA CTTTGATTTG GAGAATTGGT 960 

GCTAGTTGGC AATAGATAAC TCAGCGTAGA TAGTA.TTGCA AAAAGGGGAG GAAATACACA 1020 

55 ACAATAATAA ATGTAAAAAC CTGCTATTCA ACATGCAGTT TTATTTCGAP: GCCAAAAATC 1080 

TAGAGCTTTC CCAAGATCCT GTTGCCTTAG GCACATNCAC ACTTCAACAG TGCACACTAT 1140 

CCAACAGTGC ACACTATTCA AOjGTGCACA CTATTCAAAA CCGTAGACTA TTTTTTTGCA 12 CO 

60 
1260
1320 
1330 
1440 
1500 
1560 
1520 
1630 
1740 
1777 
(2) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 643 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:
TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTGGG AATGAGAAAA TAACTTTATT 
TTCATTGTGG GGAGCGGGCC GATGTCCAGC CTCAGAACTT CTGGAACTGC TTCTTGGTGC 
CGGCAGCCTT GGTGACCTTG AGCAC GTTG A AGCGCACTGT CTTGCTCAGA 
CGCCCACTGT GACGATGTCA CCGATCTGGA CGTCCCTGAA GCAGGGGGAC 
ACATGTTCTT GTGGCGCTTC TCGAAGCGGT TGTACTTGCG GATGT AGTGC 
GGCGGATGAC AATGGTCCTC TGCATCTTCA TCTTGGGTCA CCA.CGCCA.GA_ 
CCTCGAATGG ACAC^TTACC AGTGAAGGGG CATTTCTTGT CAATGTAGGT 
AGCCTCCTTG GGGTGTCTTT GAAGCCCAGA CCGATGTTCT TGTTAGTAAC 
TTCTCCTTGC CAGTTTCTCC . CAGCAGGACC CTCTTCTTGT TTTGAAAGAT 
TTTTGGTAGG CACGCTGAGT CTGAATGTCC GCCATCTTCT CGTGCCGMAY TCCTGCAGCC 
CGGGGGATCC ACTAGTTCTA GAGCGGCCGC ACCGCGGTGG AGC 
(2) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1220 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

GGCACGAGGA TGAT AG AC CT ACTGGAGGAA TA.CATGGTTT ACAGGAAGCA TACCTACATR " 60 

AGGCTTGATG GCTCATCCAA GATCTCGGAG AGGCGAGACA TGGTTGCTGA TTTTCAGAAC 120 

AGGAATGACA TCTTTGTGTT CCTGTTAAGC ACACGAGCTG GAGGACTGGG TA.TCAATGTC 130 

15 ACTGCTGMAG ACA-CAGTGCA TTTTCTATC-A TAGCGACTGG AACCCCACTG TGGACCAGCA 240 

GGCCATGGAC AGGGGCCACC GCTTAGGGCA GACAAAGCAG GTTACTGTGT ACCGGCTCAT 300 

CTGTAAAGGC ACCATTGAAG AACGCATTCT GCAAAGAGCC AAGGAGAAGA GTGAGATTCA 360 

20 

■ GCGGATGGTC ATTTCAGGTG GGAACTTCAA ACCAGATACC TTGAAACCCA AAGAGGTGGT 420 

TAGTCTTCTT CTAGACGACG AAGAGTTGGA GAAGAAACGT ATGTACTCTA AACCTCTATA 430 

25 CACTCCCCTC ACGTATCTGA GAATGGAAGA GGT ACTTGG 3 TGTGTGCCAA GGGTTAGGCA 540 

AAGCCAGAGG CTGTATTTAG GGAAAGTATT* TTTGTGCTCA T ATT FT AT AT AAAAACCCAA 600 

ACAAGAATGT GTTTGTAGGC CAGGCGTGGT GGCTCGCGCC TCTAGTCTCA GCATTTCGGG 660 

30 

ARGCCAAAGT GGGCAGATCA CCTGARGTCA GGARTTTGAG TTTGARA.CCA GCCTGGCCMA 720 

CGTTGTGAAA CCCCACCTCT ACTARGARTA C SGAAAATTG GTTGGGCATG GTGGCGGGCA 730 

35 CCTGTAATTC CAGCACTTTG GGAGGCTGGG GCAGAAMAAT TGCTTGAGCC CAGGAGGTGG 840 

AGATTGCGGT GAGCCGAGAT YGTGCCATTG CAMTCGAGCC SGGGCAATAA GAGTGAAAYT 900 

CCATCTTTTA AAAACAAACA AAAAGAAAAA ACACAAGACG GCTCACACCT GTAATCCCAG 960 

CACTTTGGGA RGCCGAHGCA GGTGGATCAC GARGTCAGGA GTTCCAAGAC TAGCCTGGCC 1020 

AACCTGGTGA AGCCCCGTCT CTACTAAAAA TACMAATATT AGTCGGGCGT GGTGGTGGGC 1080 
45 ACGTGTAATC CCAGCTACTC GGGAGGCTGA GGCAGGAGAA TCCCTTGAAG CTAGGAGGCA . 1140 

GAGGTTGCAG TGAGCCAGGA TCGTGCCATT GCACTCCAGC CTGGAGAACA AGAGCAAGAT . 1200 

TCCATCTCAA AAAAAAAAAA 1220 






(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 721 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:
AATTCGGC.-.C GAGCCAGGTT AGCCGGAAG3 GCAGCTCTCC AGGCCCTGCC C-.CCCC--C.-jG 
GGGGCTCCTT ATGC\C-.GCC- GGGCGTCTCC TTGTGGCCAT AGAAACGGAA CTGGCTCTTT 
TCAAOGTGC TGCAAGAGGA TGGTTATTTA ACGCTGGCCC CCAAGGAGGA AAGGCACftGA 
CYTTCCTCCC TCCTGGAACA TCCAAGGGCA CTGGATCCTC TGTGTCCCTC TGAGATGGGG 
TGCCACTCCA GC\AGAGCAC CACGGTGGCA GCTGAGTCCC AGAAGCTTGA AGAAGAGYGC 
GAGGGAAGAG AGCCAGGTCT GGAGACCGGC ACCCAGGCAG CAGACTGCAA C-GATGCCCCG 
CTGAAGGATG GAACCCCTGA GCCAAAGAGC TGAAATGC-CT ' CTCTCCAGAG TCGGACCCTC 
ACCTCrrrCC TGGAACTGCC TTTGGCCCCA GAACCATGAG acaatcccca ccctgagaag 
CTCCCATCAC tgggaggaga c-.gaaa.gcct ccagctttgg gattcaggct tcagaagttt 
ttagcagcct ttgctcattg gagaggtggg gaaaggataa agttcttata aggaaatccc 
taatttcccc cagctcctcc ccncgmgaag aaggaacmaa agaaagttcc ttcc a .cacgt 
tttgttggaa acttttccct tgccaacttt ccttggattg cc\gaacaaa gccctccaga 

A 

(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1468 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:
ATGAATTAAT GTTTATAAAT GACTGTACTG A-.TTTAAAAC CGTACAGTTT C^TTTGCATT 
TTGACATTAC TTTATT AT AC ATTTTGCATT TAAAAGGCTG C\CCAGTTGG CTTTTCTTCT 
GTTTTATTCT CAAAATATAG AGATTCTGTG ATTTATTTGC ■ CCTGTTTATG GATTAAAAAG 
AAAATTCTAA TAT AAAGCAT - TTCAATAGGA TCCATAGGTA TATTACGTTT TTTAAATGCT 
TTAGATCTGT GATTCTTGAC TTACTATTTA TTTTATCCCC TTTAAGTCAG GGATGCTTTA 
TTCTATTTTA AAGCACTTAT GAGTTACATG TTGTAATCAA GTTTGCACAA TATATTTATC 
TATATGAGGA ACCCATAAAT GAATAGCTAA TTTTTAAAAT GCCATTAAAA TGCATGAAAT 
KCTTATTAAA ACCTTACTAT ACTATTTCTT CAAGGCAAGT AAATTGACCA TGKGRAAAGR 
ACACAGTTAT TAAACACTGT TGACAGGAAA ATTCTCCTTG' ATAACATAGG ACAATTAATG 






PCT/US98/11422 



GAAAAAAAAA TTCTCATTAT TTGCl^AAGAA TGAAjCAAGTT AATGA-.C.-AA CA-ACTACAT SOG 

TTGGTATGTT TTCAGCTTTT GTATCATGTT TAATTGTTTA ATTTGGTTGA AAAACTGCAG ScG 

TTGAGAAATC AGATAjGGAAT ATAGACATTC ACAGCACCTC TGTGGATACC ATGTAATTGT 720 

CAGGTAATTT CAGAAIGTTG AAAATTATTC AGTGCAGCCC TCATAGTATC ATACTTGAAG 730 
AAATTGATTA CAGTTCCACT AAATTGTTGA AGATAAATTA TTTTT A-AGG TTATGAAAAC * 340 

TAAGTTATAT TAATTCATAT GTTTGATTTT TAAATCCCAC CTCCTCAAGC TATCCAATTT 900 

MCTGACTTTG AAAATAAGCA TGAGAGATGC CACATTTGTC TCTGGGAAAC TACCACTCAA 960 

AGAATAATTG TTAAAAATTA ACCTTTTAGG TATTAGAAGC TGTTATAAAG TATAAAATTA 1020 

AGATA.TAAGG AGATCACAT.G TAAATCATTC CTAAAGCACA AGAAAAGAAT GTGGGTTGAT 1030 

GTACATATAT TACTAAGTTG CCTCTCCCAG TTTACTTTAA AAATGGGTTT AAGGATAAAG 1140 

AATAAATGTG ATAGCTGTGC ATGCATTATA TATTTGCATT TGCAAATTTC CCATTGTTTT 1200 

AACAGGTGTG TGGCTGACTT TCAATTTTAA GACGTGAATT GACATACAGC CCATAACTTT 1250 

ATAATGGGTG CTGATTTATG TTATGTTTCA GTTAGTGGAA AAACATTTCA ACCTGACTAA 1320 

AATTTGGAAT TGTGTCTTTT ATGTTCCATG CTCTGTTGTT ACTAGATTTA GTTTAAAAAT 1330 

TGTGTATGAC CATTAATGTA TGTCATAAAC ATGTAAATAA AAGATGTTGA ATCTTGTTGA 1440 

AAAGCAWRAA AAAAAAAAAA AAACTCGA 1468
(2) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 300 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:



TGAATTTTTT GCCAAACTTA GTAACTCTGT TAAATATTTG GAGAATTTAA AGAACATCCC 60

AGTTTCAATT CATTTCAAAC TTTTTAAATT TTTTTGTACT ATGTTTGGTT TTATTTTCGT 120

TCTGTTAATC TTTTGTATTC RCTTATGCTC TGGTACATTG AGTACTTTTA TTCCAAAACT 180

AGTGGGTTTT CTCTACTGGA AATTTTCAAT AAACGTGTCA TTATTGCTTA CTTTGATTAA 240

AAAAAAAAAA AAAAAAAAAA AAACCCCMAG GGGGGGGCCG GGTNCCCAAT CCCCCCCAAA 300



(2) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2243 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:



TGCCTCCCTT CCTGCAGATT GTGGACAGTA GTTCCTCAGC CTGCACCCTG GATXCCTTCT 
TCCCCTTCCT AGCTCCATGG GACTCGCCCC AAGACTGTGG CTTCAAGGAC CACCAGCCCC 



TTACTCTTCA AGCCCTGACT GTGGAGTTGG T AG ATGCCT G TGATCCTCAG TATTCTCTCT 
GGCAATGTTC CACGGCTTCT CCTTCCTGGG AGCTGGCTCC ATAACTTGAT TTTCCCCAAA 
CGTGTTGCAA TCCCTGCTGC CCCTTAGCCA . CCCftGGGTCT TGTGTGGGTA TGAGTGTAGA 
GGATGGGGGT ATGCCAGGCC TGGGCCGTCC CAjGGCAGGCC CGCTCGA-CCC TGATGCTACT 
CCTATCCACT GCCATGTACG GTGCCCATGC CCCATTGCTG GCACTGTGCC ATGTGGACGG 
CCGAGTGCGC TTYCGGCCCT CCTCAGCCGT GCTGCTGACT GAGCTGACCA AGCTACTGTT 
ATGCGCCTTC TCCCTTCTGG TAGGCTGGCA AGCATGGCCC OVGGGGCCCC CACCCTGGCG 
CCAGGCTGCT CGCTTCGCAC TATCAGCCCT * GCTCTATGGC GCTAACAACA ACCTGGTGAT 
CTATCTTCAG CGTTACATGG ACCCCAGCAC CTACCAGGTG CTGAGTAATC TCAAGATTGG 
AAGCACAGCT GTGCTCTACT GCCTCTGCCT CCGGCACCGC CTCTCTGTGC GTCAGGGGTT 
AGCGCTGCEG CTGCTGATGG CTGCGGGAGC CTGCTATGCA GC-KCGGGGCC TTCAAGTTCC 
CGGGAACACC CTTCCCAGTC CCCCTCCAGC AGCTGCTGCC AGC CCC ATGC CCCTGCATAT 
CACTCCGCTA GGCCTGCTGC TCCECATTCT GTACTGCCTC ATCTCAGGCT TGTCGTCAGT 
GTACACAGAG CTGCTCATGA AGCGACAGNG GCTGCCCCTG GCACTTCAGA ACCTCTTCCT 
CTACACTTTT GGTGTGCTTC TGAATCTAGG TCTGCATGCT GGCGGCGGCT CTGGCCCAGG 
SCTCCTGGAA GGTTTCTCAG GATGGGCAGC ACTCGTGGTG CTGAGCCAGG CACTAAATGG 
ACTGCTCATG TCTGCTGTCA TGAAGCATGG CAGCAGCATC ACACGCCTCT TTGTGGTGTC 
CTGCTCGCTG GTGGTCAACG CCGTGCTCTC AGCAGTCCTG CTACGGCTGC AGCTCACAGC 
CGCCTTCTTC CTGGCCACAT TGCTCATTGG CCVGGCCXTG CGCCTGTACT ATGGCAGCCG 
CTAGTCCCTG ACAACTTCCA CCCTGATTCC GGACCCTGTA GA.TTGGGCGC CACCACCAGA 
TCCCCCTCCC AGGCCTTCCT CCCTCTCCCA TCAGCAGCCG TGTAACAAGT GCCTTGTGAG 
AAAAGCTGGA GAAGTGAGGG CAGCCAGGTT ATTCTCTGGA GGf TGGTGGA TGAAGGGGTA 
CCCCTAGGAG ATGTGAAGTG TGGGTTTGGT TAAGGAAATG CTTACCATCC CCCACCCCCA 
ACCAAGTTCT TCCAGACTAA AGAATTAAGG TAACAT CAAT ACCTAGGCCT GAGAAATAAC 



60 
120 
130 
240 
300 
360 
420 
430 
540 
600 
650 
720 
730 
900
960 
1020 
1030 
1140 
1200 
1260 
1320 
1330 
1440 
1500 
1560 
CCCATCCTTG TTGGGCAGCT CCCTGCTTTG TCCTGCATGA ACAGA/GTTGA TGAAAGTGGG

GTGTGGGCAA CAAGTGGCTT TCCTTGCCTA CTTTAGTCAC CCAGCAGAGC CACTGGAGCT. 

GGCTAGTCCA GCCCAGC GAT GGTGCATGAC TCTTCCATAA GGGATC CTCA CCCTTCCACT 

TTCATGCAAG AAGGCCCAGT TGC CAC AG AT TATACAACCA. TT ACC GAAAC CACTCTGACA 

GTCTCCTCCA GTTCGAGGAA TGCCTAGAGA. GATGGTCCGT GCCCTCTCCA CAGTGGTGCT 

CGCCACAGGT AGGGTTTGTT CTGGAAACCC CAGAGAGGGG TGGGGTTGAC TGATGTCAGG 

GAATGTAGGG GCTGGGGGGT GGGTTAAGGC GAC ACTG GTG ACGTGTGTGT TCACGGTGAG 

GGCTGTCTTG AAGGGCGGTA CCCACTCTGA GGCTCCTAGG AGGTACGATG CTTGCCACTG 

TGGGGCCTGC CGGTGCCTAG CAGTCTCCCA GCTCCCAAGA GGGTGGGGAA GCTCTGCACA 

GAGTGAGGTG AGAGGAGGTA CAGGAAAC CT GTAGCTCAAT CAGTGTGTGT vvTAAGTGCAT 

AAGGAATAAG ATGTTAATAA AGTCTTGTAG GGTGTAGGGT GGTTCGTACA ACCACAGGGA 
AAAAAAAAAA AAAAAAACTG GAG" 

(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1032 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:
GCCAAGGTCT AATACGAGTC ACTATAGGGA AAGCTGGTA.C GCCTGCAGKT ACCGGTTCCG 
GGAATTCCCG GGTCGACGCA CGGGTCCGCT . TGGGTGTGTG AAAATCGTCA CCTCCTTCAT 
AAC CATCTCC CACAATTAAT TCTTGACTAT ATAAATTTAT GGTTTGATAA TATTA.TCAAT 
TTGTAATCAA TTGAGATTTC TTTAGTGCTT GCTTTTCTGT GAGTCAACTG CCCAGACACC 

- TCATTGTACT TGAAAACTGG AACANCTTGG GAATGCCATG GGGTTTGATA ATGTGCGA.GG 
GAJCA.TGAAGA GGGTCAGGTT CGTGGGACCA TGACTTTGGC TCAGCTGATC CTGNACATGG 
GAGAACAACC ACATTTTTCT TTGTGTGTGC TTGTAGCAGC TGTTCGGGAG GACCXTGACC 
CAAYAGTGTT CCCATGCTGT TTCTTGTGAA ATGCTGTCGG CTATGTAGCA GCTTTTGATT 

""CCCTGCAT^fC* l^CTAGGCTGC TGCCCCTATC CTGTGCCfTG^rTTATAACAT TGAGAGGTTT 
TCTAGGGCAC ATACTGAGTG AGAGCAGTGT TGAGAAGTCG GGGAAAA.TGG TGACTACTTT 
TAGAGCAAGG CTGGGCATCA. GCACCTGTCC AGCTCTACTT GTGTGATGTT TCAGGAACTC 
AGCGCGTTTT TCTGCCTAGG ATAAGGAGCT GAAAGATTAA CTTGGATCTY CTAATGGTCC 
AAATCTTTTG GTCACAATAA AGAGTCTCCA AATTAGAGAC TGCATGTTAG TTCTGGATGG

ATTTGGTGGC CTGACATGAT ACCCTGCCAG CTGTGAGGGG ACCCCGTTTT TAAGATGCA7 

GGCCAAGCTC TCTGCAAATG GAAATGCTTA CACTGOGTGT TGGGGATGTT TGCTACCTCC 

TGCTATTTTT GTGGTTTTGG TTCTCCCACT ATGGTACGAC CGCTGGCCAG CATCGTGGCT 

IGTCATGTCA GCGCCATTGA CTACCTTCTC ATGCTCTGAG GTACTACTGC CTCTGCAGCA 

CAAA.TTTCTA TTTCTGTCAA TAAAAGGAGA " TGAAAATAAA. AAANAAAAAA AAJOAACTCG 
NG 

(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4313 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:

CAAGCTGCTT TGAAACTAGG GGTCGGGCTC GGCCGTCGTC GTTGTTTGTC GCCSCA.TCCC 

CGCTTCCGGG TTAGGCCGTT CCTGCCCGCC CCCTCCTCTC CTCCCTTCGG AC CCATAGAT 

CTCAGGCTCG GCTCCCCGCC CGCCGCAGCC • C-.CTGTTG AC CCGGCCCGTA CTGCC-GCCCC 

GTGGCCACCA TGTCCCTGCA CGGCAAACGC AAGGAGATCT ACAAGTATGA AGCGCCCTGG 

ACAGTCTACG CGATGAACTG GAGTGTGCGG CCCGATAAGC GCTTTCGCTT GGCZCTGOQC 

AGCTTCGTGG AGGAGTACAA CAACAAGGTT CAGCTTGTTG GTTTAGATGA GGAGAGTTCA 

GAGTTTATTT GCAGAAACAC CTTTGACCAC CCATACCCCA CCAOAAGCT CATGTGGATC 

CCTGACACAA AA.GGCGTCTA TCCAGACCTA CTGGCAACAA GCGGTGACTA TCTGCGTGTG 

TGGAGGGTTG GTGAAACAGA GPCCAGGCTG GAGTGTTTGC TAAACAATAA TAAGAACTCT 

GATTTCTGTG CTCCCCTGAC CTCCTTTGAC TGGAATGAGG 1GGA.TCCTTA TCTTTTAGGT 

ACCTCAAGCA TTGATACGAC ATGCAC CATC TGGGGGCTGG AGACAGGGCA GGTGTTAGGG 

CGAGTGAA.TC TCGTGTCTGG CCACGTGAAG ACCCAGCTGA TCGCCCATGA CAAAGAGGTC 

TATGATATTG C.ATTTAGCCG GGCCGGGGGT GGCAGGGACA TGTTTGCCTC TGTGGGTGCT 

GATGGCTCGG TGGGGATGTT TGACCTCCGC CATCTAGAAC ACAGCACCA.T C\TTXACGAA 

GACCCACAGC ATCACCCACT GCTTCGCCTC TGCTGGAACA AGC AGO AC CC TAA.CTACCTG 

GCCACCATGG CCATCGATGG AATGGAGCTG GTGATTCTAG ATGTCCGGGT TCCTGCACAC 
CTGTSGCCAG
CATCCTGCCA 
AAATGCCCCG 
AATGTGCAG7 
AGATAGTCAG 
CTGCCTCTGC 
CATTGCTTTG 
TOGCCCTCTG 
AGATTTTCTC 
GTTTGTCAGG 
AGGTGTCTCT 
TTAGCACTWA 
CAAATTTTAA 
ACTCTCCTGC 
TCAGTGCCTG 
CTGCCTGTGT 
TGACTCATGA 
ACATAGGAAG 
AGCATGGTAG 
ATGAAGCTGA 
AATATAAGCC 
GAGTTGAACT 
TCTTTGGATT 
AACCCTGAAC 
CAAGTGGATA 
AGAAGGAAAA 
TCAACGCCTC 
TCTTTTGACT 
TATGTGTGAT 
GGGAAAGTTG 



GTTAAACAAC CATCGAGCAT GTGTCAATGG 
CATCTGCACT GCAGCGGATG ACGACCAGGG 
AGCCATTGAG GACCCTATCC TGGC-TACAC 
GGGGATCAr-C TGAGGCGGAA YTGTCGGCAT 
AGTGTAGTGT TGGTG3CGGT GTGCCGACGA 
CCCACGGCGA AAGTAAGAAG AAACATGTTT 
GACGCACTGT TACCAGAAGC TGCTCTAGGA 
TGGCAGACTC AGTGCTGTGT GGCGCCTCCT 



TCCTTTCCTC TTCTCCTTTG GTTCGTCAAT 
CGTTGTGTTG AGGAGCAGTT CACGGACTGG 
GTTTGCTGCG CAAiCGYWKX? TTTCATGTGT 
GGTGGGAACA AATAGGAATT TGTCTTTTCT 
CTTTGTATAT TTGTTATCTA TCAGGCTAAT 
TTCATTTCTT ■ TGTCTTATAG* TCCTCGCTGT 
GAGCTGGTAC TGGGGCCCTG GCCGGATGAG 
AGTACATACC TGACCGGGAG TCCAAP-.CCAC 
CACCTTTCTT AGCCTGGCTC CTCTGAAGGG 
CCTCTGTTTA CCCTGAAGCA CCACTGTCCA 
AGCTGAGAGA AACAGGGTCT CAGGGTACCT 
ACTTCAAGGA TATTTCCAGT ACATTCTTTC 
CGAGGGGATT CGACTTAGTG TCTTTTCAAT 
TCGGTCCTTC TGTTGTTTGA GTTTACTGTG 
GAGTGTTCTG AGGTGAGAGA GTCTTCCCGA 
AAGACCTTAG ATGAGAGATG GACTGATGGA 
GATAGTTAAA AAGCATTATA CTGTGGGTAA 
GGAATTATAG ACCCCCAGGG TGAGGCAGTT 
TCTCCCCCAG TTTAGGTTGT GAGCAGTATT 
TGCA3GCCGC AGTGTCTTTC TGTTATGTGA 
TCCACC'GTTA GATGAGCCCT TGGGGCAGGC 
GCTGTTTCCT TGCGCTCTGG TCCTACCCGA 



CATTGGTTGG GGGCCAGATT 1020 

TCTCATCTGG GACATGCAGC 1080 

AGCTGMAAGG WGAGATCAAC 1 1 40 

GTGGTACAAG AAGTGGGTGG I2Q0 
GGC^GGGGCT TTTGTATTTC ■ 1260 

GGAGTGGGeA' GTATGTGTTT 1320 

GTTCCTGGCC AGTGACCGCA 1330 

CAGCGCAGGG CTGASTTTTA 1440 

TAAAAAATGT GTGTATATTT 1500 

CTGTGTCTAT TCCTCTGCCC 1560 

GGTCCATGTG CATGTTGGTG 1520 

GCTAGTATCA GTGTGTTTAA 1530 

TTTTTTATGA AAAGAA.TTTT 1740 

TTGCACCTTC TTCTCTTCGC 1300 

O.GTTTGCCT TGTTG AGT CA 1360 

CTTGGTGCTC TGAAGTCCAG 1920 

CATTCTGGGG TTGTA.-ACAG 1980 

GCCCATTGGT TGCCAGTGGG 2040 

GACTTGAGGG GAATCGTTTC 2100 

AGAGTCTGTT TTTCGATGGA 2150 

GATAGGCAAG AATGATATCT 2220 

CCTGGTGGTA TATTGGGCAT 2230 

GGCATGGTGT CTGTGCTTCC 2340 

CVGCGGCL^.T CCTGGGCT GT 2400 
TGAAAAGGGA GGAnAAAAAA ' 2460 

AAGAGCTGTA CCCACACGTG 2520 

GGACTTGTAG GGTGCAGTTG 2530 

ATGAGTTGCA TGGAGGGGCA 2S40 

AGTTTGGGAT GTGCTCTTGG 2700 

AGTTTTTAAG ' TCCCTGTGAA 2760 
TTGCTCATCT GAGATT^CTA GAGTAGCAGG CGTGAAGGAT GATGGTTTTG TCCTCTTTGG 2820

TTCTCACCTG CTTGAGAAGT AAAACAGTAA CTTTGTTCTT CTGGGCCCTT AAGCTTTTTT ' 2380 

GGTTAAGTCT TCCTTTTCAG AAGTAGATGT CATTATATGC CAAAAGTCTA GCTCTTTGCT 2340 

TTACCATACA GGGACCTGTC CCAAAGAAAA AGGCTCTTTT TTTAGCCAGC ATATTTCCCC 3000 
TTCTAGCGTT TTACTTTGTT GTTCTGATTT TAGGACTCTG GCTQGCC\TG TGCTTGTG-GT ' 3060 

TGCCTCTCCT GCATTTGGCA CTGGATTTGG ACTGCATCGT TTGGAGATAC AAAGCGAGCA 3120 

GTTGTTGGTC AGAACCGTCC TCTGCTTTTC ATTGTGTTTG ATAATGGTTA CTGGGTCCTT 3130 

CTCTCAAGGG TAGCAAGGCC APGCIGATGG CTGCTTGTTT AGGAGGGGAT CAGTTCCTTC 3240 

CTGTGGAGAA GGGTGTGAAA TGGAA.GTCAG TGGTAGAAC-G GGGTGGTGTG CTGGGGAGGG 3300 

CTTACATCCA CTGAGTTCTA AGATTCGTTT CCTGATCTGC ACCTACGCGT GGTCTGTATG 3 360 

GTGGAATTTG TCAGGTGGAA GTCAGAAACA ACAAGTTGAA AAAA AAAT AA TAATTAGAAC 342Q 

ATATTTGCAT- AAGATAGGTA TTTACTCTGG AAACGAACAA CTTTTGAGAT. TTCGGTTGGC 3430 

CTGTGGAC'GG CCAGCTCC7G TCATCC1TCC TTAGGTCCTG CAGTACAGTG TTCCCCTGAA 3540 

TGCCACCGOG GACCCAGGGG GACTCCACCC' CCCTAAGCAA GGAGAGA.CA.T ACTCAGAGTT 3500 

GATGAGTTGG TGGTCTTTGA GTGCCAGCTG TCTTACCCTC CCTTTACTCC ACCAGGCCGA 3660 

CGACCCATGA CTGAGGACGG GATTTGTAGA GTCTO\GGAT TTAGAAAGTC TGTAAGGGAT 3720 

CCATGCTGCA GAAP-GCACGG ATCTGTTGTA GTTGCA^AAA CAACTCTGTA ATTTGTTGAG 3730 

GTTCTCAAAC TGACAGCCAG CGAGACTGGG TGGGAGGCGG TCGATCTGTT CTCCCTGACT 3 340 

GGGGGAGGAG CAGGCACTAG GACTTTAjGGA GGAAGCCCAG ATGGAGGCTC CGGCAGGGTG 3 900 

TGGGCCAGGT GGTGATGGCQ CTTTTGGTCG TGGCAGCCTG AGGCAGAGGT GGCTGTATTG 3960 

TCCTCATCTG TTCTGACTGA AGGATGGAGG TGCTGAATAA ATTAGGCCTC AGGOITCTAC ,4020 

CACCAGAGAG CTGGAGAATG GGTCCACGTC ATTCAAGGAC CTGAATTTTT TATGCTCAGG "4080 

AGGATTGGAA TCGTGTTGTT CCAGGGAGGA ATTAGGCTGG AAGGTTAGGA CTTGAAGAGG 4140 

GAAGGTATTT AATAAGTGGG CGAGGATGGG TGTGGTGGCT CACACGTGTA ATGGCAGGAT 4200 

TTTGGGAGGG TGAGGTGGGC AGATG CCAAG GTCAGAAGAT CGAGAGCATG CTGGCTAACA 4250 

TGGTGAAACC CC^TGTCTAC TAAAAATACA AAATTAAATT GGCCGGGCGT GAA m 4313 

(2) INFORMATION FOR SEQ ID NO: 147:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1183 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:

GGCAGAGCCT CAAGCTGA.CT TGGATTATGT GGTCCCTCAA ATCT AG CGAC ACATGCAGGA - 50 

GGAGTTCCGG GGCCGGTTAG AGAGGACCAA ATCTCAGGGT CCCCTGACTG TGGCTGCTTA 120 

TCAKWYGGGG AGTGTCTACT CAGCTGCTAT GGTCACAGCC CTCACCCTGT TGGCCTTCCC 180

ACTTCTGCTG TTGCATGCGG AGCGCATCAG CCTTGTGTTC CTGCTTCTGT TTCTGCAGAG 240 

CTTCCT^CTC CTACATCTGC TTGCTGCTGG GAT AC CCGTC ACGACCCCTG GTCCTTTTAC 3Q0 

TGTGCCATGG CAGGCAGTCT CGGCTTGGGC CCTCATGGCC ACACAGACCT TCTACTCCAC 3 60 

AGGCCAC'GAG CCTGTCTTTC CAGCCATCCA TTGGCATGCA GCCTTCGTGG GATTCCCAGA 420 

GGGTCATGGC TCCTGTACTT GGCTGCCTGC TTTGCTAGTG GGAGCCAACA CCTTTGCCTC 480

CCACCTCCTC TTTGCAGTAG GTTGCCCACT GCTCCTGCTC TGGCCTTTCC TGTGTGAGAG" 540 

TCAAGGGCTG CGGAAGAGAC AGCAGCCCCC AGGGAA.TGAA. GCTGATGCCA GAGTCAGACC • 5 CO, 

CGA.GGAGGAA GAGGA.GCCAC TGATGGAGAT GCGGCTCCGG GATGCGCCTC AGCACTTCTA. 660 

TGCAGCACTG CTGCAGCTGG GCCTCAAGTA CCTCTTTATC CTTGGTATTC AGATTCTGGC 720 

CTGTGCCTTG GCAGCCTCCA TCCTTCGCAG GCATCTCATG GTCTGGAAAG TGTTTGCCCC 780

TAAGTTCATA TTTGAGGCTG TGGGCTTCAT TGTGAGCAGC GTGGGACTTC TCCTGGGCAT 340 

agctttggtg atgagagtgg atggtgctgt gagctcctgg ttcaggcagc tatttctggc 900 

CCAGCAGAGG TAGCCTAGTC TGTGATTACT GGCACTTGGC TA.CAGAGAGT GCTGGAGAAC 960 

AGTGTAGCCT GGCCTGTACA GGTACTGGAT GATCTGCAAG ACAGGCTCAG CCATACTCTT 1020 

ACTATCATGC AJ3CCAC-GGGC CGCTGACATG TANGACTTCA TTATTCWATR ATTCAGGACC 1080

ACAGTGGA.GT ATGATCCCTA ACTCCTGATT TGGATGCATC TGAGGGAGAA. GGGGGKCGGT 1140 

STCCGAAGTG GAATAAAATA GGCGGGCGTG GTGACTTGCA CCT 1183

(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 734 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:

GAATTCGGCA. GAG^GAAGCA TTAGAATGAT TCCAACACTG CTCTTCTGCA CCATGAGACC 60 

AACCCAGGGC AAGATCCCAT CCCATCACAT CAGCCTACCT CCCTCCTGGC TGCTGGCCAK 120

GATGTCGCCA GCATTACCTT CC\CTGCCTT TCTCCCTGC-G AAGCAGCACA GCTGAGACTG 13 0 

GCCACCAGGC CACCTCTGTT GGGACCCACA CCAAAGAGTG TGGCAGCAAG TGCrfTGGCTG 240

ACCTTTCTAT CTTGTGTAGG CTCAGGTACT GCTCCTCCAT GCCCATGGYT GGGCCGTGGG 300 

GAGA.iGA.AGC TCTCATACGC CTTCCCACTC CCTCTGGTTT ATAGGAGTTG ACTCCCTAGC 360 

CAACAGGAGA GGAGGCCTCC TGGGGTTTCC CCRRGGCAGT AGGTCAAACG AC CTGATCAC 420 

AGTCTTCCTT CCTCTTCAAG CGTTTCATGT TGAACACAGC TCTCTCCRCT CCCTTGTGAT 430 

TTCTGAGGGT CACCACTGCC ARCCTCAGGC AACATAGAGA GCCTCCTGTT CTTTCTATGC 540
TTGGTCTGAC TGAGCCTAAA GTTGAGAAAA TGGGTGCCAA GGCCAGTGCC AGTGTCTTGG 600

GGCCCCTTTG GCTCTCCCTC ACTCTCTGAG GCTCCAGCTG GTC CTGGG AC ATGCAGCCAG 660 

GACTGTGAGT CTGGGCASGT CCftAGGCCTG CACCTTCAAG AAGTGGAATA AATGTGGCCT 720 

TTGCTTCTAT TTAA 734 

(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1405 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:
GGCACAGTGG ACCCCAGACT CCCTCTCCGC CTTTCT-CTGC CTGGGGAGAC CCACTGTGTG 60 
CATGGCATCA CTGACTCCCA TACCTCTGGC TATCAAAGGT TTCTGCCATG GCCACCCTGG 120
AAGSAAACCA GAGGGAGGTA GACAGGGAGA TCAGGTCCCT TCTACTCTGG TTCCTGCTCT 180
GTGAAATTGT CTCAGGCTGG CTGTGTCCAG ARGGTCCCTG GTTCTCTCAR GGATGCCAAA 240 

TCTACAAGAA TCTCTCCTCT TCCAGTTCCT ATAACCTCTC CTTCCTTTTG TCTCTTTAGA 300
CCTTGGAGTA GTAGCAGCCA GGTTCTTTCT ATCTCTGGGT TAGTGCATTA TCTCTGGTGG 360
CTCCCTTACC CAGGACTTTG GGAATGGTCT TTTTGTAATA CATTCTCCTC AAATAATTCA 420
ATTTTGAGTG TTCTGTATGT ATCCTGCTGG GAGGTTGTTA TATACAAATC ACTGTGCCCG 480
TTTAGCAGAG AAGGAGACTG AAGCTCAGGG AGGTTAAGTG TCTTTCTCTA GGTCGTATTG 540 

TGGAGAAAGT GGCTGACTGG GGACTTGAAT GAGGTCCCTA GTTTCATGCT CGGAGGGCAA 600
AGANGAATGT CCAATTGGCC TGAGATAAGC CTCTGGTAAA ATGTACTGTA CATAATAGGT 660
AATCAATAAA TGTTGGCTGA TGACAAACAT GTTTTCTTTG TTCATTAGTT ATAGTGATTA 720
TGTTCTAAAT AACTCCMACA AGGAAF.TCAG CACATTTGGA ATATCAOTAT CTTTCCATGA 780

TAATATCTTT CCLTiGGAAAG AWAATGATAT TCOIAACTGG GAGTGTCCCW AGCA3A.TCTG 340 

ANTCTGTGTA TTGGCCCTGG GGTGGGCCAG CCCCTTAGAC TCTATGGTCT CATTCTCTTT 900 

GTTTACAAAA TTGAGATAAG GCCTTATTCT CTCCCCACCC CACCCATCCA TATTGTTTTG 960 

AGAATAAAAT GAGAGGATG7 GTGTCAAGGG TGTATTTTGG CAATAGTCTC TGAGCCATTT 1020

TCTGAGCACC TGCATACTGT TC-ACACTCAA GTAATATTTC ATCAGCA.TTC CA.TTCAGGOT L080 

CCTGGCTTAA. TGAGGTGTCG GATGTACAAG AGTYGTGAGG TGGCAAAGGA. TGGGCTCCTG 1140 

AGGAAACACT TAGGAAA.CTG GGCTTTCTGC CATTAAAAGA GACAAACCTT TGTGGTGACC 1200 

TAATTAAAGT TTTTAAAATT CAATTTGGAA AGTTAGCAAG CTAGCTCCTK TCCAGGWAAA 1250 

ATAAGGAGTC AGTGCATGAC CTAAGCGGTC CCGGGCTGCT TGCCATTCCA AACAACTGCA 1320

GTAAGTTTAT CACNTTCTTT CAGGGAGTGA GGTTTCCAGG CACAGACTTG GATAA.GGAAG 1380 

GATGTCCTAT GGGGTCACAT TGATG 1405 

(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2890 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

TTATATGCTA CAGCTACAGT AATTTCTTCT CCAAGCA.CAG AGGANCTTTC C CAGGATCAG 60 

GGGGATCGCG CGTCACTTGA TGCTGCTGAC AGTGGTCGTG GGAGCTGGAC GTCATGCTCA 120 

AGTGGCTCCC ATGATAATAT ACAGACGATC CAGCACCAGA GAAGCTGGGA GACTCTTCCA 130 

TTCGGGCATA CTCACTTTGA TTATTCAGGG GATCCTGCAG GTTTATGGGC ATCAAGCAGC 240

CATATGGACC AAATTATGTT' TTCTGATCAT JSJGCACAAAGT ATAACAGGCA AAA.TCAAAGT 300 

AGAGAGAGCC TTGAACAAGC CCAGTCCCGA GCAAGCTGGG CGTCTTCCAC AGGTTACTGG 3 50 

GGAGAAGACT CAGAAGGTGA CACAGGCACA ATAAAGCGGA GGGGTGGAAA GGATGTTTCC 420 

ATTGAAGCCG AAAGCAGTAG CCTAACGTCT GTGACTACGG AAGAAACCAA GCCTGTCCCC 430 

ATGCCTGCCC ACATAGCTGT GGCATCAAGT ACTACAAAGG GGCTCATTGC ACGAAAGGAG 540

GGCAGGTATC GAGAGCCCCC GCCCACCCCT CCCGGCTACA TTGGAA.TTCC CA.TTACTGAC oOQ 

TTTCCAGAAG GGCACTCCCA TCCAGC CAGG AAACCGG CGG ACTACAACGT GGCCCTTCAG ooO 

AGATCGGGGA TGGTCGCACC ATCCTCCGAC ACAGCTGGGC CTTCATGCGT ACAGCAGCGA 720

CATGGGCATC GGACCAGGAG GAGGGCTGTG AACAAACCTC AGTGGGATAA AYCGAACGAG 730 

TGTGACGCGC GCCTCGCCCC YTATCAGTCC CAAGGGTTTT GGAGGGAGGA GGATGAAGAT 840

GAACAAGTTT CTGCTGTTTC- AGGCACAGAC TTTTCTGGAA GCAGAGCGAG C C AGGTGAAA 900 

GGAGAGCACA AGAAGACGTC GTGAGCA.TTG GACGCTTGGA. ACTCACATTG TGAGGACGGT 960 

GGAGCAGTTT GCCTCCTTCC CTGGCTTAAA AGCAGCATGG GG3TTCTTCT CCCCTTCTTC 1020 

CTTTCCCCTT TGGA.TGTGAA ATACTGTGAA GAAATTGCCC TGGCACTTTT CAGAGTTTGT 1030 

TCCTTGAAAT GCACAGTGCA GCAATGTTGG AGCTGCCAGT GTTGCTGCCT GGGAGATGAG 1140

ACAGTATGAT TGCAAATTGC AAGATGA.TGA CAACAAGATG ATTGACTGTG GGTGGAGTTG 1200 

TCAATGGGTG GAAGGATTTT TTTTAATCTT GGTTTTAGAT TTCAATCGAG TCCTAGCACT 1250 

TGATGTGATT GGGATAATGA GAAAAGGTAG CCATTGAACT AGTTCGGGGG TTTAACGGAC 1320
CAAGGAAGAC AAAjGAAAAAC AATGAAATGC TTTGAGTACA GTGCTTGTCG AGTTGTTTAC 1380

AATGTCGTCC TTTTAAAAAA AAAAAAATGA GTTTAAAGAT TTTGTTCAGA GAGTAAATAT 1440

ATATCGA.TTT AATGATTACA GTA.TTATTTT ' AAAGCTTAAG TAGGGTTC-CC AGGCTGGTTT 1500 

CTGAAAAACC AAATATGCCG GA.CAGGGTGT GGGCACACGA AGAAGACGGG AAGACGTGGC 1560 

TTGTGAGCGT GGCTTCGCAT GTCCTTCTGG TCTCACGCGG GAAGTGGGCT ATCCTGGAAG 1520 

TATGAAATGT TAGGCAATTA ATACGAAGAC ACGTCATCTG CTCGTTCCCC AGTGGATGGG 1630 

GTTCTTGTGT AAAACTGTTT GCACATGGGG AGGGGAGGGA ACTAGGACGC TTGTGTGGTG 1740

TCTGAGCCTT ATGGAGGCAG GAGGGTGTCA TTGGGGGATG TGTCCTGCTC GA.TTGAGATG 1300 

GATGGGAAAG CCGATTTTTA AGTTATATTT CTTTGATTTT TGTTAATTTA GAGGTGTAGG 1860

TTTTGTTTTT TGTTTTTTTG TTTTTTTTTA AGAGAAAGAT TTATAACTG3 ATAGGATTGG 1920
AGTGAAAGGA GCTTGGGATG TTGGAGCTAA TGGCAGCTGT TTATACTGCT CTTTCAAGAG 1980

AGCCTCCCTT TATTGAATTG GCATTAGGGA ATAAAGAAGG GTTTAAACGT GATAAAAGAT 2040

CAAAAACGTG GTTAGAGATG CGAGC CTTTG CAAGGCAGGT TAGTCACGAA AGACTAACCT 2100 
CGAAGTGGGT TTLATGGACGG TGGATATAGA GAAGGCGTAA GTGTAGGAAC CATCTGCTCA ' 2150 

GAGCTGCTA? TAACCGTATA ATGAGTGAAA TGACGC GTGC ACTCTATTTT TGTGTTGTTT 2220 

TGGACAGAGT GGGGAAAA GT GAAGGGTGCG AATGTGAGTA GTACTGAAAT GTGAGGAAGT 22S0 

GCTCGTCTTG GATTTTTTTT GGATTAAATT GAGCTOATCA TATTGATCAC TAGATAAACG 2340

TAAATAGGTT CAAATTTTAA AAGTGGAATT GCAGTGTTTT TTGAGTGTAT CAAAGAATGT 2400 

CAGTGCTTTA TTTAATAATT CTCTTGTGTA TCATGGCATT TGTCTACTTG CTTATTACA.T 243-0 
TGTCAATTAT GCATTTGTAA TTTTACATGT AATATGCATT ATTTGCCAGT TTTATTATAT 2520

AGGCT.ATGGA CCTCATGTGC ATATAGAAAG ACAGAAATCT AGCTCTACCA CAAGTTGCAC 2530 

AAATGTTATC TAAG2ATTAA. GTAA.TTGTAG AAGA-TAGGAC TGCTAATCTC AGTTCGCTCT 2540 

GTGATGTCAA -GTGCAGAATG TACAATTAAC TGGTGA.TTTC CTCATACTTT TGATA.CTACT 2700 
TGTACCTGTA. TGTCTTTTAG AAAGACATTG GTGGAGTCTG TATCC CTTTT GTATTTTTAA ' 2780 

TACAATAATT GTA.CATATTG GTTATATTTT TGTTGAAGAT GGTAGAAATG . TACTATGTTT 2320 

ATGCTTCTAC ATCCAGTTTG TACAAGCTGG AAAATAAATA. AATATAA.CAT AAAAAAAAAA. 2330 

AAAAAAAAAA 2890



(2) INFORMATION FOR SEQ ID NO: 151:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2399 base pairs
(B) TYPE: nucleic acid

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:

GAA.CTTTTCC ATCTGGCAAA CCGGAAACTC CATCCCCATT AAACCAA.CTC CCCCTTTTGG 60 

TTTCCCCCCC AGNGGAATAG AATTTGGACN CC CAT ATAAA TCCAGGAAAC CA.CCTAAATT 120 
CTTTAGTMGT TTGTGTTTGC AAGATCTAAG GTCATGGTAA. ACATTAAGTT ' CTTAAAATTT - 130 

TTGGGAGGGA CCAGTGCACC TCTCCCTCTG AATTGTTCNC CAATTTAAAA. TTGGAGTAAG 240 

GTTTTAAAAT GTCTNATTCC ATTGGAAGGG TOTGTTATTT CATTTTGAGC CCAGAGGGGA. 3 00 
GAGGCACATT TTAAA.TATCA GAA.TTAGATT AGCTTTGAGT TTGTACAATT GGGAACATAA. . 3 50 

TAGATTTTCA TAAATTA.TGT GTGC CTTGT T GGAAGTGTCA ACTGTCTTTA TGTCTGCTTG 420 

TAAAAGTTTC AAAATA.TGTT TTCCCTCAAA AAGGCAACGT TACTTCATTT GCTTGAATAT 430 

TATGATAGGA ATGCTTACTG ATATTACTTG ATAGTCATAT ATAGCCTA.GG AAATTTAACA 540 
TATA.TATAAC TATAGCAGTA TTAATAA.TGA TAGTTGTA.CT TCTTTAAAAC ATTAAATTTG ' 600 

AGGAAACTTT AATGCTGTCT CGTGTACATT GCTTTACTAC AGTGAGGGGG AATATCCTTT 560 

AGATTGAGCC TCAATTTACT GGTTAGTAGT ATGTGAA.CTC TGGTATAAAA ACGTAAACTA. 720 

GACAGTAGAG CCGATGAATT AAAATTGTAA ATTGCTACAT TGGCA.TTTTC TACCTCCTTT .730 

TCTGTCAGAG TATTACTTTT TOCAGCATTT ATTCTTATTT GTGAGTAAAG AGGAAATGGG 340 

AACCTGAGGT TAAAATTGAC ATTTTTGTTT CATTGAGAAT TTAAGCAGTA GGTACAGGAG 900 

AAGTGA.CTTG TCACATTAA.T TTGGTGCCTA AATCTGTAAC TACAA.GTTGT GATCGACATG 360 
TACAAAATGT CTAACAAAGG TCATATGCTG AATATTTTAC TTTTCCTGTA TAGTCTGCAT 1020



GATTTGTTTC ATAAACCGAG CTTATTTCCT CGAAAAAGCA AAATGGTCCT GTAA.TTTTTA 1030 

AAGTAAAATA AACGTGGGAT TTTGTCTGCA ATCTATAATT TCAGGAA.GTT ATTGRAAGTT 1140' 



CTGA.CTCAGG GCTTTTTAAC AGTTCAAGCA ATTGTCAGTT ATATTTTGGA AACTCCATCT 1200 



GTGTAATTCT CCAGTGCCTT GAAAGAATTA TTAACTTGGC AACACTATTA AAACTTTATA 1260



AAAGATGGTC TTTAGTGCAC GTGTATCATT ATATACACGT TTTAAAGTCA TATTGCTTAG 1320 



CTTGTTAATA ATGATTCTGG ATGTGTGCTG GGTTTGGGTA ATTCTTTAAA GCAAGTTTTC 13 30 

- TAGATTTGGA CTTGATGTTT GTTTTTTAAA AACTGATTA.T TT ATGGCCGT GACACTGTTA. 1440 



CCA-GAAAAGT AATTCTAA.TT AAGTTATTA.T GCAAAGTCAT CTATAAGTAG CATCTGGGAA 15 QQ 



GAGGAGATSG AGGCCACAGT TTGGTATTTT AGTATGAAAG GAGGATCTGT TTGGGAAACA 1560



TAGATTGTCT TCCCCTCAAA TGAGGGGAAA AAAAAAGACC CTTTGTTCAA ATGGATTCTG 1620 



TTGTAAAAAA TTATTTTTAA AGGAAATCAC AAATTGTATG TCATTCTTAA TGCTAGTCTT ' 1530 
ATAGAATAAA TCCATAAAAT TGTTTTTATG , TTCAGTATGT TTATGTCATT CTAAA.TGCA.G 1740 



CAAATTCAAT GATAGCAGTT CAATTGACTC ATAGGAGTGT TTTGTATTTT TTCTAATTCT 1300 



TTAGCTTTCA ATATTGGATT AAAGTCTTGT TTGTGAATAT AGTTTCCGTA TGGCAAATGA 1860



TTTCTTGCTT • A.TT AGCTTTT GTTAAAGAAT GCTTAGTAAG AGCTAAGCTT TTAAAAGTAA 1920 



TGCAAACATT TATCGTTAAT AAAACCTATG GTGTAATATC ATATAATGCT TTTCTTTGAT 198 0 

CTTTGGAGAA TTATTCTTTT ATA.GTAGTAT ACATGAATTT TGATTTTTAA AGCATTTAAA 2040 



AACAAATCTG AATACATTAA AAAACCTGTT ATTGTTAAAA. RGGAAATTAC CATGCCTTTA 2100 



AGAAACAAOG ATGTACATCT TCAATTCAGC ATKAGTGTCC ACATCTAGAA GGCTCTCATT 2160



GCAGTTGTTT ACAGTTAAGG TACCTCTATC TAAAGGGCCA AAjGAAGCATT TCATAiTTTA 2220 



ACACCTCACA TTCTTTCAGG ATTAAGACAT ATGAAAATAG TCTGAATAGG ATAAATTTGG 2230 

ATAGGAAGTA ACTTAACCAG TCTGGGAAGA TTCAGGCTTT TTCTATXAAA AAGCTTATTG 2340 . 



CTCTTCACAA CTCNGGTGGT AGGWTTTCAT TTTTCAAGAG GGTAGATA.TT TTAAAGGGA ' 2399 



(2) INFORMATION FOR SEQ ID NO: 152:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 802 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:
CGTGCCTGTA GTAAGCTCAT CCCTGCCTTT GAGATGGTGA TGCGTGCCAA C<IACAATGTT 
TACCACCTGG ACTGCTTTGC ATGTCAGCTT TGTAATCAGA GATTNTGTGT TGGAGACAAA 
TTTTTCCTAA AGAATAAOvT GAYCCTTTGC CAHACGGACT ACGAGGAAGG TTTAATGAAA 
GAAGGTTATG CAjCCCCMGGT TCGCTGATCT ATCAACATCA CCCCMTAAG A--TACAAAGC 
ACTACATTCT TTTATCTTTT TTGCTCCACA TGTACATAAG AATTGACACA GGA.-JCCTACT 
GAATAGCGTA GATATAGGAA GGCAGGATGG TTATATGGAA TAAAAGGCGG ACTGCATCTG 
TATGTAGTGA AftTTOCCCCA GTTCAGAGTT GAATGTTTAT TATTAAAGAA AAAAGTAATG 
TACATATGGC TGGATTTTTT TGCTTGCTAT TCGTTTTTGT GTCACTTGGC ATGAGATGTT 
TATTTTGGAC TATTGTATAT AATGTATTGT AATATTTGAA GCACAAATGT AATACAGTTT 
TATTGTGTTA CCATTTGTGT TCCATTTGCT YCTTTGTATT GTTGCATTTA GTACAATCAG 
TGTTTAAACT TACTGTATAT TTATGCTTTC TGTATTTACC AGCTATTTTA AATGAGCTGT 
AfcCTTTCTAG TAAAGAATTG AAAAGCAAAT CCTCACTAAA GGATACACAG GATAGGATAA 
AGCCAAGTCM CATCAACATT AAAAAATACT AA^NA&IAAA ACACAAAAAA AAAAAANCCC 
GGGGGGGGCC CGGAACCCAT TC 

(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 461 base pairs
(B) TYPE: nucleic acid

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:

CTAGGAGCAC CGAGCAGCTT GGCTAAAAGT AAGGGTGTCG TGCTGATGGC CCTGTGCGCA 

CTGACCCGCG CTCTGCNCTC TCTGAACCTG GCGCCCCCGA CCGTCGCCGC CCCTGCCCCG 

AGTCTGTTCC CCGCCGCCCA GATGATGAAC AATGGC CTCC TCCAACAGCC CTCTGCCTTG ' 

ATGTTGCTCC CCTGCCGCCC AGTT CTT ACT TCTGTGGCCC TTAATGCCAA CTTTGTGTCC 

TGGAAGAGTC GTACCAAGTA CACCATTACA CCAGTGAAGA TGAGGAAGTC TGGGGGCCGA 

GACCACACAG GTGGGAACAA GGACAGGGGG ATTT AAGCAG TCAAAAGGAA AAACATGTTA 

AGACCCTAGA CTTGTATATT GACACACTTG TACCTTGTAA GGCAGAGGAA TGTAATTAAA 

AAGCACTTAT TTGGCvvNAAA AAAftAAAAAA AAAAftAAAAA C 

(2) INFORMATION FOR SEQ ID NO: 154:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2388 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:

GCCCACGCGT CCGAAAGCGG AGAACGCTGG TGGGCCTGTT GTGGAGTACG CTTTGGACTG 60

AGAAGCATCG ACGCTATACG ACGCAGCTGT TGCCATGACG GCCCAGGGGG GCTCGTGGC7 120

AACCGAGGCC GGCGCTTCAA GTGGGCCATT GAGCTAAGCG GGCCTGGAGG AGGCAGCAGG 180

CGTCGAAGTG ACCGGGGCAG TGGCCAGGGA GACTCGCTCT AC CCAGTCGG TTACTTCGAC 240 

AAGCAAGTGC CTGATACCAG OGTGCAAGAG ACAGACCGGA ' TCCTGGTGGA GAAGCGCTGC 300 

TGGGACATCG CCTTGGGTCC CCTC^AACAG ATTCCCATGA ATCTC TTCAT CATGTACATG 360 

GCAGGCAATA CTATCTCCAT- CTTCCCTACT ATGATGGTGT GTATGATGGC CTGGCGACCC 420 

ATTCAGGCAC TTATGGCCAT TTCAGCCACT TTCAAGATGT TAGAAAGTTC AAC-CCAGAAG 430 

TTTCTTCAGG GTTTGGTCTA TCTCATTGGG AACCTGATGG GTTTGGCATT GGCTGTTTAC 540 

AAGTGCCAGT CCATGGGACT GTTACCTACA CATGCATCGG ATTGGTTAGC CTTCATTGAG 600

CCCCCTGAGA GAATGCAGTT CAGTGGTGGA GGACTGCTTT TCTGAACATG AGAAAGCAGC 660

GCCTGGTCCC TATGTATTTG GGTCTTATTT ACATCCTTCT TTAAGCCGAG TGGCTCCTCA 720

GCATACTCT? AAACTAATCA CTTjvTGTTAA AAAGAACCAA APGACTCTTT TCTCCATGGT 730 

GGGGTGACAG GTCCTAGAAG GACAATGTGC ATATTACGAC AAACACAAAG AAACTATACC 340 

ATAACCCAAG GCTGAAAATA ATGTAGAAAA CTTTATTTTT GTTTCCAGTA CAGAGCAAAA 900 

CAACAACAAA AAAACATAAC TATGTAAACA AGAGAATJVAC TGGTGCTAAA TCAAGAACTG 960- 

TTGCAGCATC TCCTTTCAAT AAATTAAATG GTTGAGAACA ATGCATAAAA AAAGTTGCAC 1020

AAGTTCCTTA TTTTCCTTAA TATTTCACTT CTATTTAA7A CAAGCTGGGA CATAAAAATT . 1080 

CTGTTGGGGA TACCTGGGGG AAGATGTGAG AAACTAATGC TGAATTCAGC TTATACATGA 1140 

TGAAAAGAAA AACCAGACAA AAGGAGCACA TAAATATGCA TACAGTGTAA CTGTTATTAT 1200 

TTTAATACCC ACGATAAGGG ATTTTTGTTA GCATGTTTAG GGGGAACGAG GATTGGTGGG 1250 

ATCCTTGGGG CCACAGGAAT CTGAGGCAAC GGAAGATATA TAGAGTGATC GTCCCCCTGC 1320

CGAAGGAACC TGGCAYCTGT CAA3CAGATG CTGCAGTTCA AACTTCAGCT TTTAAGATAG 1330 

ATAGCTATTG AAGGCAGAGG GTCAGCAGGA GGATCTGTAT TTCTAATCTA CCCTGGTAAA 14-40 

GTCATAGGTA AGACTCAAAA GCGGGATCTT ATTCAAAAGG CAGGTATTTC CTTTGTTTTC 1500

TGTCTTGAAA TAGCCCCTTC CCCTAAGGTG CATTCTCTCA AGTTTTCAGT ATTGCTTTAT 1560 

TTGCAGTGAT TAAAAGAGAT GAGAGACTTT GGAGACAGAC AACGTAAGCA ACAC.ATACAC 1520 

ACA.TGAAA.TA CTCTAGACAG AGATGAATAT AAATCTGGCC TAATAACCAG TTTTCCATGT " 1530 

AACAGTGATT TTGTGTTTCG GGCTGAAGCA GTGGTTA.TAT TAAAAjGCCAC TAJ\TTCCCTT 1740 

ATCCCTTTAA A-AGATTTTTA CAATTCTCCA ACCACAAACA GCA.CTTCTAA AACTAA,CTTT ■ 1300 

ACTTTCTGCC CATAATTTGT TCTACATGGA AAAJ^AAJ^AAT ATTACTTTGG CCAGGGGTGT 1360 

GTGTAAATGT GGCAGAATTC CTAGGCAGGC TGACCTTTAC AGTATGGGCC TTTAAGATAC 1920 

TGGATCCTGG TTGGGCAACA. AGTGTCACGC CTGAAGTTTC TGAAAACAAA. TTAGAAGACT 1980 

GTTGGCTTGG CTAATCTCGT AGTTCAGGGC CAAGTTTCTG TAGTCA.GAAT GAAJGAATAAA 2040 

ATTGAAAGAA AAAGGGGGAA ATGCTTATAC TTGGCATTAA GTTGAATGCC TCAAGTCTTA 2100 

ACTATGGCTT' TGTAGATGAG- GCAAAAGATT TCTTAGTGGT AAAATTTCTT CAACAGGTCA 2150 

ATGC CAATCT GTATGCCATT TTAGTAAAGT AGGTAAGGAG AGTAGCCGCT CAGTAACTTT 2220 

GGCA.CTAAAG AAAGAGTGTG GCTCTAGAAC TTCCAATCCC ATTGCTAGAT GTGCCCTTTA 2230 

AAAGATGGTC CAGTGCTTTC AGGGAA.GGAT GTTTAjGCCAG TTTTCCTAGT ATTTGTTCCT 2340 

TAAGATTTTT TGACCTGTGC TTAATAAGAC GGACGCGTGG GTCGACCG 2388



(2) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 642 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

AAAACAGA.CC A.TT^AAAAA.C TCAGACAAGA TTATATTTAA TATATTAA.TT ACTAAAAAGG 50 
CACAAGATTA CACTGAACAT ATTAGCTACT AAAAAGGCAC TGCTAAGACA. TTCAAGCAAA • 120 

TAGCTATTAC ACACTACTGC AGATTTTA.GA GGTTTCTAAT TCTAACATAT GTTTGAAAAA 130 

TCCGTGAGTA TTCCAAAATA TA.TTTAATAA TGGAATA.TCT GCATTAATA.T ACCATCCATG 240 

TGTTTTTACC ATTTGCCTTA ATATTGAATA TACTGTTTAC CTCACACTAA. AAAGAAAACC 300 

AGAAGCCTTA TTTGTGATTT TGGGAGTGGA AGCTTCCATT TTTGTGTCAA AAATGAATCC 360 

TGATTCTTAT GGAAATCTCT GTTATTAAGA TATTTCAAGA TGAGACAACA CTGAA.GA.TCA 420' 

AATTGTGTTT AGTATCACTA TCTTCTCTCC TCGTTTCTCT CTTACTCCTC ATCCTCCCAG 430 
AATCTACCAG TTTATGGTAG AAACATGGGA ACCTTATTTG AATGTGTTTT TTTTTTTCCA
TGATGTCCAA TTTTGTTGTG GGAAAGGATT TGGATAAAAT TTTTGTTTAA ATTTTGGTAG 
ATTTTTATCT ATACAAATTT AAATAAAATT ATGTTTTGTA AG 

(2) INFORMATION FOR SEQ ID NO: 156:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1251 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:
GCCGCTGCCC CTCCACGGAG TTGCTGATCA TCTGGGCTGT GATCCACAAA CCCGGTTCTT 
TGTCCCTCCT AATATCAAAC AGTGGATTGC CTTGCTGCAG AGGC-GAAACT GCACGTTTAA 
AGAGAAAATA TCACGGGCCG CTTTCCACAA TGCAGTTGCT GTAGTCATCT ACAATAATAA 
ATCCAAAGAG GAGCCAGTTA CCATGACTCA ' TCC AGGCACT GAGCATATTA TTGCTGTCAT 
GATAACAGAA TTGAGGGGTA AGGATATTTT GAGTTATCTG GAGAAA^ACA TCTCTGTACA 
i5A.TGACAA.TA GCTGTTGGAA CTCGAATGCC ACCGAAG^AC TTCAGCCGTG GCTCTCTAGT 
CTTCGTGTCA ATATCCTTTA TTGTTTTGAT GATTATTTCT TCAGCATGGC TCATATTCTA 
CTTCATTCAG AAGATCAGGT ACAGAAATGC ACGCGACAGG AACCAGCGTC GTCTCGGAGA 
TGCAGCCAAG AAAGCCA.TCA GTAAATTGAC AACC^GGACA GTAAAGAAGG GTGACAAGGA * 
A\CTGACCCA GACTTTGATC ATTGTGCAGT CTGCATAGAG AGCTATAAGC - AG AATG ATGT 
CGTCCGAATT CTCCCCTGCA AGCATGTTTT CCACAAATCC TGCGTGGATC CCTGGCTTAG 
TGAACATTGT ACCTGTCCTA TGTGCAAACT TAATATATTG AAGGCCCTGG GAATTGTGCC 
GAATTTGCCA TGTACTGATA ACGTAGCATT CGATATGGAA AGGCTCACCA GAACCCAAGC 
TGTTAACCGA AGATCAGCCC TCGGCGACCT CGCCGCCG^JZ AACTCCCTTG GCCTTGAGCC 
ACTTCGAACT TCGGGGATCT CACCTCTTCC TCAGGATGGG GAGCTCACTC CGAGAACAGG 
AGAA AT CAAC ATTGCAGTAA CAAAAGAATG GTTTATTATT GCCAGTTTTG GCCTCCTCAG 
TGCCCTCACA CTCTGCTACA TGATCATCAG AGO CACAGCT AGCTTGAATG CTAATGAGGT 
AGAATGGTTT TGAAGAAGAA AAAACCTGCT TTCTGACTGA TTTTGCCTTG AAGX3AAAAAA 
GAACCTATTT TTGTGCATCA TTTACCAATC ATGCCACACA AGCATTTATT TTTAGTACAT 
TTTATTTTTT CATAAAATTG CTAATGCCAA. AGC TTTGT AT T.-AAAGAAAT AAATAATAAA 
ATAAAAAAAA AAAAACCCCG GGGGGGGCCC GGTCCCCAAT TGGCCCTATG G 1251



(2) INFORMATION FOR SEQ ID NO: 157:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2127 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:



CCGGCGGGAG AGGGAAGCTG CAGCGAGAGG CGCGGATCTC AGCGCGGGAG CAGTGCTTCT 50 

GCGGCAGGCC CCTGAGGGAG GGAGCTGTCA GCCAGGGAAA ACCGAGAACA CCATCACCAT 120 

GACAACCiGT CACCAGCCTC AGGACAGATA CAAAGCTGTC TGGCTTATCT TCTTCATGCT 180 

GGGTCTGGGA ACGCTGCTCC CGTGGAATTT TTTCATGACG GCCACTCAGT ATTTCACAAA 240 

CCGCCTGGAC ATGTCCCAGA ATGTGTCCTT GGTCACTGCT GAACTGAGCA AGGACGCCCA 3Q0' 

GGCGTCAGCG CNCCCTGCAG CACCCTTGCC "TGAGCGGAAC TCTCTCAGTG CCATCTTCAA 3 60 

CAATGTCATG ACCCTATGTG CCATGCTGCC CCTGCTGTTA TTCACCTACC TCAACTCCTT 420 

CCTGCATCAG AGGATCCCCC AGTCCGTACG GATCCTGGGC AGCCTGGTGG CCATCCTGCT . 430 

GGTGTTTCTG ATCACTGCCA TCCTGGTGAA GGTGCAGCTG GATGCTCTGC CCTTCTTTGT 540 

_ . CATCACCATG ATCAAGATCG TGCTCATTAA TTCATTTGGT GCCATCCTGC AGGGCAGCCT 600 

GTTTGGTCTG GCTGGCCTTC TGCCTGCCAG CTPJ^CACGGC CCCCATCATG AGTGGCCAGG 660 

GCCTAGCAGG CTTCTTTGCC TCCGTGGCCA TGATCTGCGC TATTGCCAGT GGCTCGGAGC 720 

TATCAGAAAG TGCCTTCGGC TACTTTATCA CAGCCTGTGC TGTXATCATT TTGACCATCA 730 

TCTGTTACCT GGGCCTGCCC CGCCTGGAAT TCTACCGCTA CTACCAGCAG CTCAAGCTTG 340 

AAGGAC CCGG GGAGCAGGAG ACCAAGTTGG ACCTCATTAG OkAAGGAGAG GAGCCAAGAG 900 

CAGGCAAAGA GGAATCTGGA GTTTCAGTCT CCAACTCTCA GCCCACCAAT GAAAGCCACT 960 
CTATCAAAGC CATCCTGAAA AATATCTCAG TCCTGGCTTT CTCTGTCTGC TTCATCTTCA -. 1020 

' CTATCACCAT TGGGATGTTT CCAGCCGTGA CTGTTGAGGT CAAGTCOGC ATGGCAGGCA 1030 

GCAGCACCTG GGAACGTTAC TTCATTCCTG TGTCCTGTTT CTTGACTTTC AATATCTTTG 1140 

ACTGGTTGGG CCGGAGCCTC ACAGCTGTAT TCATGTGGCC TGGGAAGGAC AGCCGCTGGC 1200 

TGCCAAGCTG GNTGCTGGCC CGGCTGGTGT TTGTGCCACT GCTGCTGCTG TGCAACATTA 1250 

AGCCCCGCCG CTACCTGACT GTGGTCTTCG AGCACGATGC CTGGTTCATC TTCTTCATGG 1320 

CTGCCTTTGC CTTCTCCAAC GGCTACCTCG CCAGCCTCTG CATGTGCTTC GGGCCCAAGA 1330 
AAGTGAAGCC AGCTGAGGCA GAGACCGCAG AGCOTCaTG GCCTTCTTCC TGTGTCTGGG 1440

' TCTGGCACTG GGGGCTGTTT TCTCCTTC CT GTTCCGGGCA ATTGTGTGAC AAAGGATGGA. 1500 

C^GAAGGACT GCCTGCCTCC CTCCCTGTCT GCCTCCTGCC CCTTCCTTCT GCCAGGGGTG 1560 

ATCCTGAGTG GTCTGGCGGT TTTTTCTTCT AACTGACTTC TGCTTTCCAC GGCGTGTGCT 1520 

CGOCCCOGAT CTCCAGGCCC TGGGGAGGGA GCCTCTGGAC GGftCftGTGGG GACATTGTGG 1530 

GTTTGGGGCT CAGAGTCGAG GGACGGGGTG TAGCCTCGGC ATTTGCTTGA GTTTCTCCAC 1740 

TCTTGGCTCT GACTGATCCC TGCTTGTGCA GGCCAGTGGA GGCTCTTGGG CTTGGAGAAC 1300 

ACGTGTGTCT CTGTGTATGT GTCTGTGTGT CTGCGTCCGT GTCTGTCAGA CTGTCTGCCT 1360 

GTCCTGGGGT GGCTAGGAGC TGGGTCTGAC CGTTGTATGG TTTGACCTGA TAT ACTGGAT 1920 

TCTCCCCTGC GCCTCCTCCT CTGTGTTCTC TCCATGTCCC CCTCCCAACT CCCCATGCCC 1930 

AGTTCTT AC C CATCATGCAC CCTGTACAGT TGCGAC-GTTA CTGCCTTTTT TAAAAATATA 2040 

TTTGACAGAA ACCAGGTGCC TTCAGAGGCT CTCTGATTTA AATAAACCTT TCTTGTTTTT 2100 

TTCTCCATGG AAAAAAAAAA AAAAAAA 2127 



(2) INFORMATION FOR SEQ ID NO: 158:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1625 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:



CAAAAGATCT ATAATCAGGA CATTGTTTAT GTAAGTTGGA GAANAAAAA.T TCTTCCCCTT- 50 

TATGTCCACC CTTCCTATGA TTGCAAGACA AAATTTCCCT' CCTTTACCTC ATCCCTATAA 120 

CATGGGAGGC TGAGAAAAAT GAGGGGAGAT GGAACCAGAT ACAAGGAGAT CC AATAAG AG 130 

AAGCTTATTT AAATATTGTG AAATAAAGGA AGAKCCAAAG CATTTTTTTA AGTGGGGAAT 240 

CCTTTTGAAC AGTTATTATT TATCCATATT ATTAAYAACA TCTTTTCTGA CAAAATCCAT 300 

CAGATGAAGT GTAAATGGAT AATCTTTTAA TGGATCTAAA CCTAGAAAGT TTCACTTACT 350 

GTTCATGTCC GTGTTCCAGA ATTGTGAAAT GGTGTGTGGT TTTGCTTTCC AAGTTCTTCT 420 

CTGCCTCCTC TTAATTCTCT AATTCCATGT CTTACAGAAG AATGAGAAAT TTCTTTCTTA 430 

CTTGAGTATC ATGCTCTAAA AAACTTGGCT TCAGTCACAG AAACGCTGGC TCTCCTGTGC 540
TTATATTGAA GCCAACTGCC TTTAATTCTT GGGCCCTCTT ATATTTTTAA GGTGCAAAAT 600
TTGAAGTCTC AGTCACCAGA CACAGGTTCT ATACAATTAA TGATGAGCTG GAGAAGTAAT 660

ATGTAGCTAA TTTTTCAAAA GCATTGAATA TACTTTCCGG AAAGAAAACA GAAATTAAAT 720 

ATTGCCACAT CTTGCCAGAA TCCCATCTGA CACCTTAACT TTGTCAGGTT TCCTACAACT 780

TGCTAATCAA GTTTTATACA TTCTAAATCT CCCCAGTTTC TTTGGGGCTG GAAGATGCAA 840 

CTTCCATTTA ATAGAAACTT TGAAATCTTG GGGTAAGGGA GCAGTGGGGG GACTAGGGAG 900 

AAGGATAAGA AATAGAATTA TTGAAAAGCC CCCACCAGGG ACCTTCCTGG CCA£LAATATG 960 

CAGAGTAATT CCTGCTGGCT TCACCTTTGA AAG TC C CT C G AAACTATGCA GATGAAACTG 1020 

AGTCTGTTTT TGATATTGTC AGATGTATTC TACCTTGGAA GTCCCNACAC CTAAACTGGA 1080

ATTCTTGTAT TTACATGTGC TCCACTGTCC CCCACACCAC CCCTCAATTC CTGCTGCCCC 1140 

TGCTAATGTT AAGCATTTTT CTCTTGTTAT CATCAGGTTG ACATTAAAAM CAGETACTTA 12Q0 

CAAACTGACT TGAAGCACAG ATACTTTTAC GAATGTGAT A AAATATTTTG TTAAGAAAAG 1260 

GAAAGAGGAT GTGGGTCAAA TAAAAGACGG CATGGATGTT GATTGGTGAA TACTGGTGTA 1320 

AGAAAAGGGA GCTGAGGAAT TTTTATTACT GTATTTGTAA ATGAGTTTGA AGGAATTTGT 1380

AAATGCCACT GGTACATTTT TAAGGTGACA CATTTGCTCC TTATAAAGTT ATTAAAAATT 1440 

ACAGGGTAAG cttaaatgac GTTTGGCAGT AGTTTTACTT TATATAATCA ATATTGATAT 1500 

TGTTGCTGAA CTATGTAAGT TTATGATGCA TTTTTCAGTC CCTTTTCAGA GCAAATGCTT 1550 

TTGCAATGGT AGTAATGTTT AGTTTAAATT GACTTAATAA ATTCTTACCT GAGCAAAAAA 1620 

AAAAA 1625

(2) INFORMATION FOR SEQ ID NO: 159:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1687 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:



CGGGGTCACC AGTTATTAGA GGAAGTAACA CAAGGGGATA TGAGTGCAGC AGACACATTT 60

CTGTGCGATC TGC CAAGGGA TGATATCTAT GTGTCAGATG TTGAGGACGA CGGTGATGAC 120 

ACATCTCTGG ATAGTGACCT GGATGCAGAG GAGCTGGCAG GAGTCAGGGG ACATCAGGGT 130 

CTAAGGGACC AAAAGCGTAT GCGACTTAGT GAAGTGCAAG ATGATAAAGA GGAGGAGGAG 240 

GAGGAGAATC CACTGCTGGT ACCACTGGAG GAAAAGGCAG TACTGCAGGA AGAACAAGCC 300 

AACCTGTGGT TCTCAAAGGG CAGCTTTGCT GGGMATCGAG GACGATGCCG ATGAAGGCCC 360
TGGAGATCAG TCAGGCCCAG CTGTTATTTG AGAACCGGYG GAAGGGACGG CACw-C-CAGC 420

AGAA.GCAGCA GCTGCCACAG ACACCCCCTT CCTGTTTGAA GACTGAGATA ATGTCTCCCC 430 

TGTACCAAGA TGAAGCCOCT AAjGG^ACAG AGGCTTCTTC GGGGACAGAA GCTGCCACTG 540 

GCCTTGAAGG' GGAAGAAAAG GATGGCATCT CAGACAGTGA TAGCAGTACT AGCAKTGAGG 500 

AAGAAGAGAG CTGGGAACCC TCCGTGGTAA GAAGCGAASC GTGGGCCTAA AGTCAGATGA 660

TGACC-GGTTT GAGATAGTGC CTATTGAGGA CCCAGCGAAA CA.TCGGATAG TGGACCCCGA 720 

AGGCCTTGCT CTAGGTGCTG TTATTGCCTC TTCCAAAAAG GGCAAGAGAG ACCTCATAGA 730 

. TAACTCCTTC AACCGGTACA CATTTAATGA GGATGAGGGG GAGCTTGCGG AGTGGTTTGT 340 

GCAAGAGGAA AAGC^jGCACC GGATACGACA GTTGCCTGTT GGTAAGAAGG AGGTGGAGCA 900 

TTACGGGAAA CGCTGGCGGG AAATCAATGC ACGTCCCATC A^CAAGCTC-C- CTGAGGGTA.-. 960

GGCTAGAAAG AA«AGGAGGA TGGTGAAGAG GCTGGAGCAG AGO AGGAAGA . AGGCAGAAGC L020 

CGTGGTGAAG AGAGTGGACA TCTMCAGAAG GAGAGAAAGT GGCAGAGGTG CGAAGTCTCT 1030 

ACAAGAACGC TGGGGTTGGC AAGGAGAAAC GCCATGTCA-C CTACGTTGTA GCCAAAAAAG 1140 

GTGTGGGCCG CAAAGTGCGC CGGCCAGCTG GAGTCAGAGG TCATTTCAAG GTGGTGGACT 1200 

CAAGGATGAA GAAGGAGCAA AGAGCACAGG AACGTAAGGA ACAAAAGAAA AAAC^-CAAAC 1260

GGAAGTAAGG AGAGCTGCGA GGCTCCCAGG AGAGGATGGG GACTAGGAGG AAGGGTGTGG 1320 

CATGGGTCAG TCTGGCGCCC TTGATT AC GG GCCTAGCCCC TGCTCACATC ACAGGTGTCT 1330 

GAA.G-AAO.GT GAGGTGGAGT GCCTAGAACT CCCGTGGTGG TCCTGAGCAG AGA.GGAGGA.T 1440. 

GTCCTCCTGC CTGCCTGAAG GTCTCCCATG AAAACACTGC TGAACTGTCT TOACACTCAT 1500 

GACGCTTTTT TTAAACCGTT AAAGGGAAGT TCGGTGTTGG AGCGATACTC AATGTAGTCA 1560

. GTCTACACCT GGACGTGTGG GCCACTTAAG CCCTCCCCAC CCCCATCCTA TTCCTPAATA 1520 

AAACCAGGAT AATGGAAFAA AAAAAAAAAA. AAAAAAAAAG GGGGGQCCCbl TAAAGGGNCC 1530 

CANNTTT 1687



(2) INFORMATION FOR SEQ ID NO: 160:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1842 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:

GGATGACAGA TTGCGACANA GATTTGTGAC CCTTCCTGCT GAACTTCAGA GGGAGCTGAA 60
ANCAGCGTAT GATCAAAGAC AAAGGCAGGG CGAGAACAGC ACTCACCAGC AGTCAGCCAG 120
CGCATCTGTG CCCCGAGAAT CCTTTACTTC ATCTAAAGGC AGCAGTGAAA GAAAAGAAAA 180
GAAACAAGAA GAAAAAAACC ATTGGTTCAC CAAAAAGGAT TCAGAGTCCT TTGAATAACA 240
AGCTGCTTAA CAGTCCTGCA AAAACTCTGC CAGGGGCCTG TGGCAGTGCC CAGAAGTTAA 300
TTGATGGGTT TGTAAAACAT GAAGGACCTC CTGCAGAGAA ACCCCTGGAA GAACTCTCTG 360
CTTCTACTTC AGGTGTGCGA GGCCTTTCTA GTTTGGAGTC TGACCCAGCT GGCTGTGTGA 420
GACCTGCAGC ACGCAATCTA GCTGGAGCTG TTGAATTCAA TGATGTGAAG ACCTTGCTGA 480
GflGAATGGAT AACTACAATT TCAGATCCAA TGGAAGAAGA CATTCTCCAA GTTGTGAAAT 540
ACTGTACTGA TCTAATAGAA GAAAAAGATT TGGAAAAACT GGATCTAGTT ATAAAATACA 600
TGAAAAGGCT GATGCAGCAA TCGGTGGAAT CGGTTTGGAA TATGGCATTT GAGTTTATTC 660
TTGACAATGT CGAGGTGGTT TTACAACAAA CTTATGGAAG CACATTAAAA GTTACATAAA 720
TATTACCAGA GAGGCTGATG CTCTCTGATA GCTGTGCCAT AAGTGCTTGT GAGGTATTTG 780
CAAAGTGCAT GATAGTAATG CTCGGAGTTT TTATAATTTT AAATTTCTTT TAAAGCAAGT 840
GTTTTGTACA TTTCTTTTCA AAAAGTGCCA AATTTGTCAG TATTGCATGT AAATAATTGT 900
GTTAATTATT TTACTGTAGC ATAGATTCTA TTTACAAAAT GTTTGTTTAT AAAGTTTTAT 960
GGATTTTTAC AGTGAAGTGT TTACAGTTGT TTAATAAAGA ACTGTATGTA TATTTGGTAC 1020
RGGGTCCTTT TKGTGAAYCC TTAAAAACTC AACTCTAGGA RGCAACTACT GTTTATTATA 1080
CTAAARGGCT GAAAAMCCTC CAGGCCAGAC TGCTAAGCTC TGAAATYCCT GAGAGGTCTC 1140
AGACCGGGAT TCTACTTGTT CCAAGAAAGG GTAAAGGTTC TAAACGATCT TATTGTTGTC 1200
TCCAAGCATG AACACAGGAG CATGTYAAGA AAATCTTTAC TACTTTCTYC GATGGGGAGA 1260
AATCTACATA TTTTGAATTA GAAACACCCT CACACCCACT TGAAGATTTT TTTCCTGGGA 1320
ACATTATGTC COGTAGATCA GAGGTGGTGT TGTCTTTTTG CTTCTACTGG CGATTGAGAA 1380
ACTTTGATGA TAAAAAAGAA CGGTATAGAT TTTTCAAACG TATATAAAAT ATTTTTATGT 1440
TATATGTTAT GCGATAACTT TAAAATAAAA ATAGTTTAAA ATTCTATGCT AGTGGATATT 1500
TGGAACTTTT TCCTCAAACA AACACCCCAC ACTGACTTCA GCAAAACCCT AAAAGTAGCT 1560
ACAGATTACT ACTACGAATG AATCAT"^AG TTTTCTGTCT GCAACAATTT AGAAGCACTA 1620
AGCCCAAATA TCAGGAAATG TGTGTATGAT GGAATTTTCT AGGACAAAAC AGAJIGAAGAT 1680
TAAAACAGGA TCAAGGATTA ATGGTATAAA AATGGTCTAC TAAAACAGGA TCAAGGATTA 1740
AAACAGGATC AAGGATTAAT GGTATAAAAA TCTCTACTGG TTACCGGGTG GCMGGGCCAT 1800
ACAGGGTAGT GGTGGATGGA TAGTTTAGTT TGGNAACGGT AA 1842



(2) INFORMATION FOR SEQ ID NO: 161:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 770 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:

CTTGTGATAA TGAGTGAGTC TCACAAGATC TGGTGGTCTT 50 

'CTGCTGACGC TCATTCTCTA TCCTGCCACC CTGGGAAGAA 120 

AAGTTTCCTG AGGOCTCCCC AGCTATGTAG AACTGTGAGC 130 

ATAAATTATC CAGTCTTATA TATTTCTTCA TAGCAGTGTG 240 

TTGGTATCAC AGAGAGTGGG GTGTTGCTAT AAACACATCT 300 

GGAACTGGGT AAOGGOAA GGCTGGAACA. GTTXGAAGAA 360 

AAATATGAGA AATCTTGAAA. CTTCCTAGAG TCTTAAAGGT 420 

GGGAAGCTTT GGAACTTCCT AGAGACTTGT TTGAATGGCT 480 

GATATGGACA ATGAAGTCCA GGCTGAGCTT ATCCAGACAG 540 

ACTTGAGTAA AGATCACTCT TGCTAGGCAA AGAGACTGGT 600 

AGAGATCTGT GGAAATCTGA ACCTGAGAGA * GATGATTTA3 " 5b 0 

TCTAAGCGGC AAAACCTTCM AGAGGAAGCA GAGCATAAAC 720 

GACMATGGGA GACCAAAGTT AAACCCAATT 770 

(2) INFORMATION FOR SEQ ID NO: 162:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 519 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:



GAATTCGGCA CGAGCTGAGA GGCACAGGAG CAACAGCCAG TGCCCCCTGC AGAGGACCAC 60

TGGGGTCACA GACTTCARAC CTGATGACCT GGGCTCAGA.T CCCAGCTCTG O.CCTACCAG 120 

CCGTGTGACA AGGTGTCCTC TCTGAGCCTC AGTCACACAC TCCCTTAACG GTTGGGCCTC 130 

GGCACGPJ3CC CTATGCTGTT
ATAGGCATCT GGCATTTCCC
GTGTCTTCTC TCATGATTGT
CAATTAAACC TCTTTTCTCT
AGAAC-.GATA ATACCGTAAA

GAAAATGTTA AAGCAAATTT
CAGTTAAGAA GAAGACAGGA
CTCAGAAGAC ATGAAGATGT
TTGACCAAAA TGCTGATAGT
ACATAAGAAG CTCGCTGGGA

GGTATCTGGC AGAAGAAATA
GTTTGAAAAA TTTGCAGCCT
ATGGAGCTGT TTGTGAAGGT TAAATGGGAA
CMGCTGAAT ACC-ATAATGG TGGCCTCCTT
AAYTGGGCAG GGGTGACAGA TACCTCTTCT
TGTCTCTCCC TCCCCCACGC AATTGGAAGG
GAGGTTTCTC ACCGTGGTAG GGAAATTGCT
GTCTCTCTGC AGGACGGATG AGGCTTTGCT

GACATAAAGC ACTTAGCCCA GAGCO.AGGA 
TGGCGCTGTG CTGGTGCAGG TGTGCCGAGG 
AACCTAGTTC CTTTCCAAGA ACCTAATTGG 
AGGAGGCTGG GCCCCAGCCC CAGAATACGG 
GGGTTGGGGG TGTGGGCAAC C-jGAGTGATC 
GACAGAGGC 



(2) INFORMATION FOR SEQ ID NO: 163:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 753 base pairs
(B) TYPE: nucleic acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:

GGCACGAGCG GCACGAGCAG CCAGTTGCTG J^.CTGGCACAT GGCCTCCAGC GTCCCGGCTG 
GTGGGCACAC TAGAGGCGGA GGGATCTTCT TAATTGGTAA ATTGGATCTT GAAGCTTCAC 
TGTTTAAATC TTTTCAGTGG CTTGCCTTTG TACTTAGAAA AAAATGCAAC TTCTTCTGCT
GGGACTCATC CGCTCACAGC CTTCCCCTCC ACCCTCTCTC TGCCTCATGC TCTGCCCCTG
CCTGCCATGC CTCCGATACT CACGTTTTGT ACCCCAGCAC CCGTGCCCTG TGGCCCTCGA 
TCTTTGCCTG GCTGGTTGCT CCTCACTCAG TGTTCAGGAC AAATGCTCCT GGCCGTACCC 
CATCTAGCCA GTCTAGCCCG GTGTTCGCTG TCTTCCCTGT TTCATTGATG GCTCTTATTG
TTTGTTWACT TGTGTGCTGT TGACTTTTAA CTGTCTCAGT CCCCACTGGA ATGCAAGCGA
TCTGGGAAGC TCCTAGAATT GTTCGTGCCT CTTCACAGGG CCTTACGCTG TGTGTGCTCG 
TGCCGAATTC GGCACGAGGG TATGTGCACT TGCTGGTATG TATGTAGGTG TTTGGTAACA 
CATACGTGCA CACGCAGAAT GCTTOCAGGG GACTGCACAG GGTCTAGTTC GCAGCCCCCA 
CCCCTCCCTT TGSCGGTGCA CTCTCCCCTC TGTGAGGTGC ATTCGCATGA AAGGGTGCAN 
GGTTCCTGAN CCCGCNAGCG NCACCTCCTG GGA 

(2) INFORMATION FOR SEQ ID NO: 164:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1400 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:

GG^-CAGTTT ATTAATACCT ATTA/TC-C-GAA AGTCACTTTG GTTGGCATTG AAAATTACAT 60

CATCTTTAAA GCAGTATTTG TCCCCAGATG GACTCATCAC TAGCAAAGAC TAGGTTCATT 120

GGAAGGCATA GGGTGAGAGA ATGGGAAAAT GRAGTGGAGG CGGGTTGTTA AAGTGCTGTC 180

AGTGAGTGAT TTTGTCTACT TGAACAACGG TCCATGTTTG GGGGCATATT GTGTTTCATA 240

AGAAGTGAAA GGTATTTGCA AAGTAAGCTA CAAATGACGC ATAAATCTGT TAACAACAGT 300

CCTTAATATG CAAAGATGAA A-ACAAGCAT TACTGCTACC CAAAGGGAAC TGGTGCTTGG 360

TGATGTGCAG ATGGGGCTGT TGGTTAAGAG AGCTATTACA GOTTTTCTCT CTTAGGTTTC 420

ATAGGAGGTA GTTACTGAGA TGAGATTGTT TTATCTTTTT GAATACAGAT CTCTTGTCTT 480

GAGTTAGTTC TGAGGATGGG AGTAATAAAG GAGTTTTTTG TTTTTTTGTT TGTTTGTTTG 540

TTTTGGCTCC TTAGTAATAC TCCTCTGACA TTTATTTCTA TTATTCTTCA AAGAAAGGAA 600

ACCAACTGAA ATGTTTGCTT TAACAAACAT TTTAATAAGT TCTCTGGGTT TTTTTTTCCC 660

CTTTTAAAAA AATTAGCATA TACCATAGCA ATAAAAGAAC TAATGTTAAC TATTGTATGC 720

TACAACTTAA GTGATTTTTC TAAAGAAGCA CAATGTCATT GPAAGTATTA TTGAAAAGGA 780

TCATAGTCAC ATTGAATTTG TGAAGGCCAA AGAAATTGAA GGGAGTGATA TTTTCATTTT 840

ATGATATTCA CATATTTAGT AAATTTTGTG TACAAGAATA CCAGGCAC-AG TGTTTTACCC 900

ATGGAAACAG GTTTCAGATT ACTTTGTTTT TACTGTTAGA GTCTCAAGTT TAGAAATGCT 960

AACACTTAAA TCAGTTTTTT TCTCACTATA CTTGAAGATT GTTAATATTT TGATATCTTC 1020

CTAGCTTGAT GGAATTTAAA CATATCTTCA GATCTGTGAC AGTGACAGCC AATAGGACTG 1080

ATAATATTAG CTTCAAACCA ATAATATCCA GGGTTAAAAT AAAAATCATA GTGAAAGTAG 1140

GATTGTAAAA TTATGCTATA TTAACTTTTA AGTCTGTAAT AACTTGACAT CAAAATGTTA 1200

TGTAATTACC ATAAATAATG GCTAGCGAGA ACATCTTTGG AAATTCTCAA ATTACCTTTC 1260

TTACTACACT GTTTGCAGAA TQAATGTAGA AATGATCCTG TTAGCTTTCT GAATGTTCTG 1320

TGGTTGAATG TGTTTTTGCT TAAATAAAGC TTTTGGTATT TGTTTAAATW ACAAAAAAAA 1380

AAAAAAAAAA AAAAACTCGA 1400



(2) INFORMATION FOR SEQ ID NO: 165:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2153 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:

CAGGCCTCAG GGCCTCTGGT GGCTCTGGCC CAGACAGTAT TTGCAGTTCT TGTGCTATGG 60
GTGGGAGTCT TCTTCCTCAA GTTTCGGGAG CTGTGGTGTG NCTGGATGGG GTGCTCCTCC 120
CAGGGCTCAA GGGGTGTGGT CGGGTCAGGG TCTCATTTCG CCAGGGCAAG TTCAAGGCAG 180
CAGCGCTTTG TGAGGCGCTC TTGGGGGTGG GGTGGAGGGA GAAGTTTAAG CTTTTTTGGT 240
CACAGGGACG TGGTATGGGC CCTGGGTGCA GGTGGGCACA TTCTGGTAAT GAGAGGTTTG 300
TCTGATCAGT CCTGGGTCGA TCAGTTTGTC GATGTGTCCG GGTGCGAGCG CGTCCCTTGG 360
GATCCTTCCG CTGGGGTGTA GGCTTGTTGA TTAGTATATA CTCATTOGTT CATGCTTTCC 420
TCAGCAGAAC ACTTCGACTT CTGAGGTGAG CTTTTGCCCC RTGCCCTTCC TCCACAGGTG 480
TTGCCTTTTT ATAAAGACCT GATAGCAGAA TAAATTGGTG TTTCCGTGTT GAGCCAGCAG 540
CATTTCTGTG GGCCTAGAAT ATGGGGCTGA ACGCTTAGAG TGGGGCAGTG AGGGGTTGAG 600
GAGTGACGGT TCCTTTCTCA TGGTTTTAGT CATTTTGGCT GGGAGCGGTT AATGGGACAG 660
ATCTGCTGCT TCTAACAGAT GGGCAGGAGG TGACACGGAT TTCAGGCATT GGCAAGGTTA 720
GCACCCTCTC CTTTGAGGGT AGGGCGAGAC TGTTCATTGT CACTTTAGGG AAGTGCCTGT 780
TTGGCTTTAA AGGTAAGGGT GGCAGGTGTG AGAAGCGTTG GTAACTGATG GACTCATTTC 840
GTGGTGGTTA AAGATGCAGG CTCTTAAGGG CTCGTTGATG GATGGCATCT CTCCTAGCCC 900
CCAGGCCTGG TGGCACTGGT GGGGAGGTTC CCATTGTTTG GGGCTGGGAG GGACAGGTTG 960
CCTGTTTCTG GTCACAAATT ACAGTCTTGT CTGCTGTACG ATTCTGTGGG TTGAGGATGG 1020
GGGGAGTAGC GTTTCATTAG TGTAGATAGT CATTGCCTGG TAGGGTGGAG GGTAAGACAT 1080
AGGGTCTGGA ACTGTTTGGG ACCTTTTGGG GATGTCCTGT GGGTCCGAGA TTGGTMGATT 1140
CTGGGAGGAG AGGGTGCGGC ATTCTGCTGC TCGTGACAGG GAGCAAAGGT GGACCGACTT 1200
ACATTGAGTA TTTTGGTGGG AGTACAAAGA GTGGGAAGGG CTGGGATTTG CTGCTGCTCC 1260
CTTAGAGCAG GGGCCGTTTT TTCAGCACTT TGGACACCTG GAGACCCAGG CCTGTTATTT 1320
AATGGTAGTG GGGAAGTGTG TGTGCATAGT GTCTGCCACT GCTTTGTCGG TGGCGCATGG 1380
CAGAGAGGGC TGTCCCTGCC AGGGCGAGCC TTCTTAGGGG GAACTTGGGA ACAAAGTGGA 1440
ACATGGGATG ATGGGTTGGG GTGGTCAGGT GAGGGCTGTC TATAGTGGTT CCCTGGGCCA 1500
AGGTGACACC AGCGCGTGAG GGTGGGGTGG GACGGGTGGT GGTTAAAAGA GGAAGGGGAG 1560
CAGTGTAGGA ACTTGCCAGG GACGCGAGGG GTCCGTGTCT GGGGGTGTGG AGTGAGCATG 1620
GGGATTGGCA TGAAGGGGGG TGGGACCTGT GGTAGTTAGG TAGGG3GTGM TGACGGGGTC 1680
ACTCCTGACC ACATGCACGT TCCCTAGATG CAGACTC-CTT TGAACTTTAA AGCTGTACAA 1740
TTTGGTTATG TTTGTGCTGA CTTAAAATAT ATTTTAATGA CGAAAAAATA ATGGAGAACC 1800
CTCGGAAGGA CCTGGTTCTT TTGCTTCTCG GGGAACTGTA AGCCCTCGCG TTCTGGGAAT 1860
CGCTCTCTGC TGCTCTTTCC TGGAAGCTAA GCCTGTCTCC ACCGCCCGAG GCCTGCGCCG 1920
GTGCTCCCGC CGCAGTTGCG TTTGCTTTGG ACCTTGCGTG CGGGGGAGGG GGTGCTCGGT 1980
CCGAGCCCGC TCCTTTCTGT ACACCTAGCG CTGCCCGCCC CGCTTGTGTC TGAGGTCGTG 2040
TATGTCAAAA ATAAAGCCGC TAGAAACGGA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2100
AAACTCGAGG GGGGGCCGGT ACCCAATTAA CCCNNTATGA TCTATAAAGC GTC 2153

(2) INFORMATION FOR SEQ ID NO: 166:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1251 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:

GCCCACGCGT CCGCCCACGC GTCCGGCGGT GCGGAGTATG GGGCGCTGAT GGCCATGGAG 

GGCTACTGGC GGTTCCTGGC GCTGCTGGGG TCGGCACTGC TCGTCGGCTT CCTGTCGGTG 

ATCTTCGCCC TCGTCTGGGT CCTCCACTAC CGAGAGGGGC TTGGCTGGGA TGGGAGCGCA

CTAGAGTTTA ACTGGCACCC AGTGCTCATG GTCACCGGCT TCGTCTTCAT CCAGGGCATC 

GCCATCATCG TCTACAGACT GCCGTGGACC TGGAAATCCA GCAAGCTCCT GATGAAATCC 

ATCCATGCAG GGTTAAATGC AGTTGCTGCC ATTCTTGCAA TTATCTCTGT GGTGGCCGTG 

TTTGAGAACC ACAATGTTAA CAATATAGCC AATATGTACA GTCTGCACAG CTGGGTTGGA 

CTGATAGCTG TCATATGCTA TTTGTTACAG CTTCTTTCAG GTTTTTCAGT CTTTCTGCTT 

CCATGGGCTC CGCTTTCTCT CCGAGCATTT CTCATGGCCA TACATGTTTA TTCTGGAATT 

GTCATCTTTG GAACAGTGAT TGCAACAGCA CTTATGGGAT TGACAGAGAA ACTGATTTTT 

TCCCTGAGAG ATCCTGCATA CAGTACATTC CCGCGAGAAG GTGTTTTCGT AAATACGCTT 

GGGCTTCTGA TCCTGGTGTT CCGGGCCCTC ATTTTTTGGA TAGTCACCAG ACCGCAATGG 

AAACGTCGTA AGGAGCCAAA TTCTACCATT CTTCATCCAA ATGGAGGCAC TGAAGAGGGA 

GCAAGAGGTT CCATGCCAGC CTACTCTGGG AACAACATGG ACAAATCAGA TTCAGAGTTA 

AACAGTGAAG TAGCAGCAAG GAAAAGAAAC TTAGCTCTGG ATGAGGCTGG GCAGAGATCT
ACCATGTAAA ATGTTGTAGA GATAGAGCCA TATAACGTCA CGTTTCAAAA CTAGGTCTAC
AGTTTTGCTT CTCCTATTAG CCATATGATA ATTGGGCTAT GTAGTATCAA TATTTACTTT
AATCACAAAG GATGGTTTCT TGAAATAATT TGTATTGATT GAGGCCTATG AACTGACCTG
AATTGGAAAG GATGTGATTA ATATAAATAA TAGCAGATAT AAATTGTGGT TATGTTACCT
TTATGTTGTT GAGGACCACA ACATTAGCAC GGTGCCTTGT GCAKAATAGA TACTCAATAT
GTGAATATGT GTGTACTAGT AGTTAATTGG ATAAACTGGG AGCATCGGTG A

(2) INFORMATION FOR SEQ ID NO: 167:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 882 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:

GACSMTCTAG AACTATGGTC CCCGGGGACT GCAGGAATTG GGCACJMZ-CGG CTGCGGGCGC 

GAGGTGAGGG GCGCGAGGTT CCCAGGAGGA TGCCCCGGCT CTGCAGGAAG CTGAAGTGAG 

AGGCCCGGAG AGGGCCCAGC COGCCCGGGG CAGGATGACC AAGGCCCGGC TGTTCCGGCT 

GTGGCTGGTG CTGGGGTCGG TGTTCATGAT CCTGCTGATC ATCGTGTACT GGCACAGCGC 

AGGCGCCGCG CACTTGTACT TGCACACGTC CTTCTCTAGG CCGCACACGG GGCCGCCGCT 

GCCCACGCCG GGGCCGGACA GGGACAGGGA GCTCACGGCG GAYTCCGATG TCGACGAKTT 

TCTGGACAAK TTTCTCAGTG CTGGCGTGAA GCAGAGTGAC YTTCCCAGAA AGGAGACGGA 

GCAGCCOCCT GCGCCGGGGA GCATGGAGGA GAGCGTGAGA RGCTACGACT GGTCCCCGCG 

CGA23GCCCGG CGCACCC^GA CCAGGGCCGG CAGCARGCGG AMCGGAGGAR CGTGCTGCGG 

GGCTTCTGCG CCAAYTCCAG CCTGGCCTTC CCCACCAAGG AGCGCGCATT C?ACGACATC 

CCCAACTCGG AGCTGAGCCA CCTGATCGTG GACGACGGGC ACGGGGCCAT CTACTGCTAC 

GTGCCCAAGG TGGCCTGCAC CAACTGGAAG CGCGTRATGA TCGTGCTGAG CGGAAGCTGT 

GCACCGCGTG CGCCTACCGC GACCOGYTGC GNTCCCGCGC GAGCACGTGC ACAACGCCAG 

CGCGCACTGA CTTCAACAAT TCIGGCGCCG CTACGGGAAC- TCTCCCCCAC CTCATGAAGT 

CAAGCTCAAG AATACACCAA TTCTTTCTGC GCGACCCTTC TG 

(2) INFORMATION FOR SEQ ID NO: 168:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1208 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:
GGGAAACTCA AAAGGATGAT GGAATGGTTG ATGGAGCCAG AGCCTAGAAG TRAAGGGATA 
CAGAGTGAAG ATAGAGGTAT TTACGTATAT TTWAATATTA GCTTTGGAAT TACGTAGGGA 
TTCTTAAGAA AAGATCATGA CAGGACAGCC ACATTTGGTA AAATGTCAGG GCAGCCAGTG 
CATGGTCCTC CTGGGGCTCC TCAGTTGACG GGTTTAAATC ATTTCCTGAT CCCCCTGCCC
TGGTTTGAGG AATGCATACA GTACGTGAAA TGCCTGTGGT ATGAGTTGCA ATGGGCAATC 
AACCTGGGTA AATCCAAGAT TAATGATTAG TTCTAAAGAT CCAGTTGAAG TTCTAGAGTG 
GGAATTTTCC GTCAAGCARC TCAGCACAGC TTTATGCCTG TTCCTCTAAT AACGATAGGT 
AACAAATAGC TGTGTKTWCA CAGCTAGGAR GATAACCAAA TCTAGAGTTC TTGARTCTCA 
TTTAATAAAT AAKTATTATG AGTACCAACT GCATATTTCA GGGACTGCAT TTGACTCTGT 
TAAATACTGA TYCCTTAXGA CMSCCACWTO ' AG AWAACNTT A-.TCTGTCTG ATCAATAAAC 
AGCTTGACTT AGAGRGGTAA AATAGCTTGC CACAGGTWAC CCAATTAGTA GGTAACAGCG
ACAGAATAAC AGTGCAGTTA AAATOTTAGA CTGGAGACTA ATTGCATAAG TTTGAATTTC
AGTTCTGCTA TGTAAATTTG GGTGAGTACC TTAATTYACC TGAGTCTCGG TCTTTATATC
TGTAGAATGG AGCTAATGAT ATTACTTAAT TTGCTTTATG TGAGATTAAA TGTACTAATA
TATGTAAATC ACTTACAACA GCA^TTGACA TATTTGACAT ACTTAATATA TTTGCTACTA
ATACTATTAG CAACAGCATT CTGATTTTCC AAGTTGAAAT TCAGTGTTTT CTTTTTTACT 
TTGCCATAAT TTACAATGTT GTGCTCTGTA AACCATAAAT TTCCCTGAGG TGTTGTCAGG 
TTAAAAAAAA ATCACTATGG CCCCCAPJMMA CTTGGAAAAT AGAAATGAGA CCAGCTTCAT 
CTATATTCTT TACTGCAAAT AACTTAGAAT TGTAATAC-GC TAATATGTAC TGGGACTTCC
AATTTGGGAA TATGACAAAA ATAATACTAT TTAGCTAAAA CATATACAGA ACTTATTTTT 
CCTCTGAA 

(2) INFORMATION FOR SEQ ID NO: 169:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1307 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

GGCACGAGAG AAAAGAGGTT GAGAATGTTT TCTAGCAGGC AGAATGTGGA TACATGTTTT 60

CATGARTGTC CTTTGGGTGC TGTTTCTTTT AAATCCTCTG TGCACAGGGC TCTGGCCTTT 120

ARTAAACTGT TTTTCTGTCT TACGTCATGC TGACTGGGTG CTAGGGGCTG ATTACAAAGG 180

GGAAGAGTTG AAC--GACATC AGGGGCCGAT GAAACCAAAG GACTAGGAGT CAGGAGAACA 240

AGTCAGGGAT TAGGAGACAG CGGTTTGGTT TATTGTTATC CAGCTGGAGG ACTCCTAGGG 300

GCAGCAGCAG GAGGAATACC AGGGCCACGG AGGGGCAGGA GTCTCACAGT GGAGGGCAGA 360

CTCTAACAGA TGCCAGCTGA ACGCTCGCTG GCCCTGGATG TCATACGAGT TGGGGACCAG 420

AAATCTGGGC TCAGAGAACC CGTCCAGGGA GATTTGAAGC CATC-GGTTAT CTTCTAGAGT 480

TGATACTGAT AATATATTTT AATTTTTATT GATGTTTAAT ACCTTCTGAA ACAGGAGGGT 540

AAGATCAGAT GGGAAGC Z CY TCTGTTGAAG GATCTTGGGA ACCTTGGTGG TTTTTTTTTT 600
TTGGTTTTTT TTTTTTTGAT CGAGCTGTGG ACATCCTTCT TAATTCGATT NTGAGGATTT 660

GTTTAACTAA AAAGTTCCCA AACACAGAAA GGGCCTCGCC ACCTGCTTTG GGGAGCTGTC 720

TGTSCTGGGA GTGCCAGGCA TCC 3 ATGGGA CCCATCACTG CCAGTGTCTG TGCCTCCCAG 780

AGGTCAGCCC TGTGTCTGCC CTGGCTCTGT CTCCTCTGTG ACAGGGCAGA GCATTTCTGG 840

TCAGTTTCTC CATGGTGCCT CCCACCCCTT TGTAAAGTGG ATGGACATGA TGGAATTCAG 900

TTGTCTCACC CTGATAGCCT GGGTOTTGAT ATTCACTTTA CCCGCACTCA GACACAGGCG 960

ACCTTGAAGC AGTTCTCGGT GTGTAGAGTC CACGTGACAG TCCCCACAGC CTCCCCAGAT 1020

AGCTGTGTGC CTGTGCGCTA CTGCTGTGCC ATTTTCCCAA CTTNGGCGTT TCACTAAATG 1080

CAGCTGATCT CTCTCTCTGT GCACTCGTGA TCCATGTTGA ACAATACATG TAGGTTCTTT 1140

TTCCACGCAA TGTAAGAACA TGATATACTG TACGTTGGAA AGCATTTACC TTATTTATAT 1200

ACCTGAATGT TCCTACTACA CAAATAAACA TATATTAAAT WCTAAAAAAA AAAAAAAAAA 1260
CTGGAGGGGG GGCCCGGTAC CCAAATCGCC GGATAGTGAT CGTAAAC 1307



(2) INFORMATION FOR SEQ ID NO: 170:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1624 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:

GGCACGAGGT CGCCGCCGCG GGCGCCTGGA ATTGTGGGAG TTGTGTCTGC GACTCGGGTG 60

CGGGAGGCGA AGGTCGCTGA CTATGGCTC Z CGAGAGGCTG CCTTGATGTA GGATGGCTCC 120

TCTGGGCATG CTGCTTGGGC TGCTGATGGC CGGGTGCTTC ACGTTCTGCG TCAGTGATCA 180

GAACGTGAAG GAGTTTGGCG TGACCAACGC AGAGAAGAGC AGCACCAAAG AAACRGAGAG 240

AAAAGAAACG AAAGGCGAGG AGGAGGTGGA TGGGGAAGTC CTGGAGGTGT TCCACCCGftC 300

GCATGAGTGG CAGGGCGTTC AGCGAGGGCA GGCTGTCCCT GGAGGATGGG ACGTACGGCT 360

GAATGTTCAG ACTGGGGAAA GAGAGGCAAA ACTCGAATAT GAGGACAAGT TCCGAAATAA 420

TTTGAAAGGG AAAAGGGTGG ATATGAACAC GAAGACGTAC ACATCTGAGG ATGTGAAGAG 480

TGGACTGGO. AAATTCAAGG AGGGGGCAGA GATGGAGAGT TCAAAGGAAG ACAAGGGAAG 540

GCAGGGTGAG GTAAAGCGGG TCTTCCGCCC CATTCvAGGAA CTGAAGAAAG ACTTTGATGA 600

GGTGAATGTT GTCATTGAGA CTGACATGGA GATCATGGTA CGGCTGATCA ACAAGTTCAA 660

TAGTTCCAGG TCCAGTTTGG AAGAGAAGAT TGGTGCGGTC TTTGATGTTG AATATTATGT 720

CGATCAGATG GACAATGGGG AGGACCTGGT TTCCTTTGGT GGTCTTCAAG TGGTGATGAA 780

TGGGCTGAAC AGGACAGAGG CCCTCGTGAA GGAGTATGGT GGGTTTGTGG TGGGGGGTGG 840

CTTTTCGAGG AAGGCGAAGG TCCA3GTGGA GGCCATCGAA GGGGGAGCCC TGGAGAAGGT 900

GGTGGTCATG CTGGGGACGG AGGAGGCGGT CACTGGAAAG AAGAAGGTCG TGTTTGGACT 960
GTGCTCCCTG GTGCGCCACT TCCCCTATGC CGAGCGGCAG TTGCTGAAGG TCGGGGGGGT 1020

GCAGGTCCTG AGGACGCTGG TGGAGGAGAA GGGGACGGAG GTGCTCGCCG TGGGGGTGGT 1080

CACACTGGTG TACGACGTGG TCACGGAGAA GATGTTCGCC GAGGAGGAGG CTGAGGTGAC 1140

CCAGGAGATG TCCCCAGAGA AGCTGCAGCA GTATCGCCAG GTACACCTCC TGGCAGGGCT 1200

GTGGGAACAG GGGTGGTGCG AGATCACGGC CGACCTCCTG GCGOTGCCCG AGGATGATGG 1260

CCGTGAGAAG GTGGTGCAGA CACTGGGCGT GCTCCTGACG ZSZCVGCCGGG ACCGGTACCG 1320

TCAGGACCCG CAGGTCGGGA GGACAGTGGG GAGGCTGGAG GCTGAGTACC AGGTGGTGGG 1380

CAGGCTGGAG GTGCAGGATG GTGAGGACGA GGGCTACTTC CAGGAGGTGG TGGGGTCTGT 1440
CAACA3GTTG CTGAAGGAGG TGAGATGAGG CCCGACACCA GGACTGGACT GGGATGGCGC 1500

TAGTGAGGGT GAGGGGTGCG AGCGTGGGTG GGCTTCTCAG GGAGGAGGAG ATGTTGGGAG 1560

TGCTGGGTTG GGGATTAAAT GGAAACGTGA AGGCCAAAAA AAAAAAAAAA AAAAAAAAAA 1620

AAAA 1624

(2) INFORMATION FOR SEQ ID NO: 171:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2003 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:

GGCACGAGCC AGCTTGCAGG AGGAATCGGT GAGGTCCTGT CCTGAGGCTG CTGTCCGGGG 60

CCGGTGGCTG CCCTCAAGGT CCCTTCCCTA GCTGCTGCGG TTGCCATTGC TTCTTGCCTG 120

TTCTGGCATC AGGCACGTGG ATTGAGTTGC ACAGCTTTGC TTTATCOGGG CTTGTGTGC? 180

GGGCGCGGCT GGGCTCCCCA TCTGCACATC CTGAGGACAG AAAAAGCTGG GTCTTGCTGT 240

GCCCTCCCAG GCTTAGTGTT CCCTCCCTCA AAGACTGACA GCCATCGTTC TGCACGGGGC 300

TTTCTGCATG TGACGCCAGC TAAGCATAGT AAGAAGTCCA GCCTAGGAAG GGAAGGATTT 360

TGGAGGTAGG TGGCTTTGGT GAC^CACTCA CTTCTTTCTC AGCCTCCAGG ACACTATGGC 420

CTGTTTTAAG AGACATCTTA TTTTTCTAAA GGTGAATTCT CAGATGATAG GTGAACCTGA 480

GTTGCAGATA TACCAACTTC TGCTTGTATT TCTTAAATGA CAAAGATTAC CTAGGTAAGA 540

AACTTCCTAG GGAACTAGGG AACCTATGTG TTCCCTCAGT GTOGTTTCGT GAAGCCAGTG 600

ATATGGGGGT TAGGATAGGA AGAACTTTCT CGGTAATGAT AAGGAGAATC TCTTGTTTCC 660

TCCCACCTGT GTTGTAAAGA TAAACTGAGG ATATACAGGC ACATTATGTA AACATACACA 720

CGCAATGAAA CCGAAGGTTG GGGGCCTGGG CGTGGTCTTG CAAAATGGTT CCAAAGGCAC 780

CTTAGCCTGT TCTATTCAGC GGCAACCCCA AAGCACCTGT TAAGACTCCT GACCCCCAAG 840

TGGCATGCAG CCCCGATGGC CACCGGGACC TGGTGAGCAG AGATCTTGAT GACTTCCCTT 900

TCTAGGGCAG ACTGGGAGGG TATCCAGGAA TGGGGGCGTG CCCCACGGGC GTTTTCATGC 960

TGTACAGTGA CCTAAAGTTG GTAAGATGTC ATAATGGACC AGTCCATGTG ATTTCAGTAT 1020

ATACAACTCC ACCAGACCCC TCCAACCCAT ATAACACCCC ACCCCTGTTC GCTTCCTGTA 1080

TGGTGATATC ATATGTAACA TTTACTCCTG TTTCTGGTGA TTGTTTTTTT AATGTTTTGG 1140

TTTGTTTTTG ACATCAGCTG TAATCATTCC TGTGCTGTGT TTTTTATTAC CGTTGGTAGG 1200

TATTAGACTT GCACTTTTTT AAAAAAAGGT TTCTGCATCG TGGAAGCATT TGACCCAGAG 1260

TGGAACGCGT GGCCTATGCA GGTGGATTCC TTCAGGTCTT TCCTTTGGTT CTTTGAGCAT 1320

CTTTGCTTTC ATTCGTCTCC CGTCTTTGGT TCTCGAGTTC AAATTATTGC AAAGTAAAGG 1380

ATCTTTGAGT AGGTTCGGTC TGAAAGGTGT GGGCTTTATA TTTGATCCAC ACACGTTGGT 1440

CTTTTAACCG TGGTGAGCAG AAAACAAAAC AGGTTAAGAA GAGCCGGGTG GCAGCTGACA 1500

GAGGAAGCCG CTCAAATACC TTCACAATAA ATAGTGGCAA TATATATATA GTTTAAGAAG 1560
GCTCTCCATT TGGCATCGTT TAATTTATAT GTTATGTTCT AAGCACAGCT CTCTTCTCCT 1620
ATTTTCATCC TGCAAGCAAC TCPiAAATATT TAAAATAAAG TTTACATTGT AGTTATTTTC 1680
AAATCTTTGC TTGATAAGTA TTAAGAAATA TTGGACTTGC TGCCGTAATT TAAAGCTCTG 1740
TTGATTTTGT TTCCGTTTGG ATTTTTGGGG GAGGGGAGCA CTGTGTTTAT GCTGGAATAT 1800
GAAGTCTGAG ACCTTCCGGT GCTGGGAACA CACAAGAGTT GTTGAAAGTT GACAAGCAGA 1860
CTGCGCATGT CTCTGATGCT TTGTATCATT CTTGAGCAAT CGCTCGGTCC GTGGACAATA 1920
AACAGTATTA TCAAAGAGAA AAAAAAAAAA AAAAAACTCG XGGCGC^GCC CGGTACCCAA 1980
TTCGCCCTAT AGTGAGCCMA TTC 2003



(2) INFORMATION FOR SEQ ID NO: 172:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 786 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:
GGCACAGCGG CACGAGAAGA CTTTGGTGTT TAAGAGATTA ATGTGTTAGC CAGAACAACT 
CATTTCTCTA CG2GTGTGTA GTCCATTTAT CTTTAAAGAT TTTCTATTGG AATAATTTTG 
AAATTACTTT CTTAGTTTTC TTCATTAAAA ACTAAGAAAA TGCTTTGTTT ATTATGAATT 
GCTATTTCTC TTGATTATTA TTCTTGGAGA AAGTCTATCA GACGTAATTC TTCTGATTTG
CTTCTAGGCT AGAGGAAAAT GTGAAAGATG ACAAATGAAA ATTTCAAAGG TTGTCAGTAG 
TATGACTTCT TTTATCGTTT GTCATTATCA CAAATATATC A^CATAGGAC TTTTAAAAGA 
TATTTTGTAC ATATTGGGCC TTAGTAGGAT TTTGCATGAA TTTTTTTTTT CTTTTATGCC 
CAGPsGAGAAA GAGCAAAGAA ATAACCAAGG GTGATGTACT CGTATTGAAG GTTTACCAAA 
TAAGGACTGC TTTTATTATG AACTATAGTC TATATTCTAA GTAAATCAAT TTTTCTATTA 
TGTGTTTTTT GTTCCTGCAG GCAAGATCTC TGAACTTTAT GCAGAGGGTT CTTTTAAAAA 
AACAAAGTTG AATTTTTTTA TTTCTTGGAA TATTTTTTTT CATTGATTTC TCCCAAGTAG 
AGCAGATTCA AATCTCCTTT GTACCCTATG TCTTTTTTGT TTTGCTATTA GCTCAGTATT 
CCGTTTCTAC ATTTTCCTTT CCTAGAACCA GTCAATAAAT GACAAAAAAA AAAAAAAAAA 
ACTCGA

(2) INFORMATION FOR SEQ ID NO: 173:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1758 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

GGGACGAGCC CTGCCCACCT CCTGCAGCCT CCTGCGCCCC GCCGAGCTGG CGGATGGAGC 60
TGCGCACGGG GAGCGTGGGC AGCCAGGCGG TGGCGCGGAG GATGGATGGG GACAGCCGAG 120
ATGGCGGCGG CGGCAAGGAC GCCACCGGGT CGGAGGACTA CGAGAACCTG CCGACTAGCG 180
CCTCCGTGTC CACCCACATG ACAGCAGGAG CGATGGCCGG GATCCTGGAG CACTCGGTCA 240
TGTACCCGGT GGACTCGGTG AAGACACGAA TGCAGAGTTT GAGTCCAGAT CCOAAGCCC 300
AGTACACAAG TATCTACGGA GCCCTCAAGA AAATCATGCG GACCGAAGCT TCTGGAGGCC 360
CTTGGGAGGC GTCAACGTCA TGATCATGGG TGCAGGGCC?. GCCCATGCCA TGTATTTTGC 420
CTGCTATGAA AACATGAAAA GGACTTTAAA TGACGTTTTC CACCACCAAG GAAACAGCCA 480
CCTAGCCAAC GGTATTTTGA AAGCGTTTGT CTGGAGTTAG AAAGTTCTCT TCTTCAACAG 540
GTCCCTCCCC AGGGTGTTCC TCGCTGTGAC CCAGCCGCCT CGACTTCGGC CCGCTTGCTC 600
ACGAATAAAG AACTCAGAGT TGTGTGTGCA ATGCACACCC AGACACACGC ACGOjCAIAC 660
ACGCGCGCGC ACACACATGC TTTTTTCTGT TCCCCTCCGC TTTCTGAAGG CTGGGGAGAA 720
ATCAGTGACA GAGGTGTTTT GGTTTTATTG TTATGTGGGT TTTCTTTTGT ATTTTTTTTG 780
TTTGTTTTGT TTTTAAACAT TCAAAAGCAA TTAATGATGA GACATAGGAG AAACCCTGAA 840
TAGAftACAAA ACTTTTGAAT GCTGGATTCA AAAAAAAAAA AAAGTTATCT GGACAGCTTC 900
TTTGAGACTA TTTAAAAACT GGTACAACAG GTCTCTACAA CGCCAAGATC TAACTAAGCT 960
TTAAAAGGTC AAGAAGTTTT ATGGCTGACA AAGGACTCGC GGAACGCAGA AGGCCTTTCC 1020
CACCTTAAGC TTCCGGGGAT CTGGGAATTT TACCCCCATT CTCTTCTGTT TGTCTGAGTC 1080
TCATCTCTCT GCAAGCAAGG GCTGAAATCA TTTTGTTTGG TTGTTTTGAG GGAGAGAGGC 1140
GGGGTGGGGG GGTGCAAATC TGCCAGCAGC TCTTACGTAA GGCATGTTTT ATTGGGGAGG 1200
GCTGAGCTTT TATTTTCTCC TCTCCAGTGG GGTTGGCTTT TATTGTTTCT TGTTTGGGTT 1260
TGGAATGGAA ATATGGATAG CA3CATAAAG TACTTTTATT TTGACAAAAT TCATTTTTTT 1320
CAACAATGGA GACATAGATT TGACCCACAA TAACTTCTCC CCCTCTCTTT TTACTCTGCT 1380
CAAAAAGCAT CTCTCCTCCC ATTACCCAAC CTTGGTCATA AGTOTGCCTG GCTGGTTTGC 1440
AGATATTTGT TCTGCTTTGT AAAAATTGGC CATTAGTGCA TTTATTGAGA TGATCTCTAA 1500
AGAGCTATGC CCTGACCTAC CCCTGATTCT ATGACATTC-G GGCCCTTCTT TTGCTGAAAC 1560
TGCCTTACGT AATGGTTTTA CTCCTTGAAA GAGATTTTGAC GGAATCCATT TTATGCCAAG 1620
TGCTGCCCTG CACTGTTTCT GCAATATGTG GTGTATGCTG TGGTGATCTT GCTGGGAATG 1680
ATTATAAGTG TGTGTGTGGT GGGGGAGTGG GTATTACATG CATTGCTGAA GAGTCAAAAA 1740
AAAAAAAAAA AAACTCGA 1758



(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 888 base pairs
(B) TYPE: nucleic acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:
CTGTTAGAA? GCOCAGTTTA CCTGGATGGC AACCCAACAG TGCTCCTGCC CACCTGCCCC 60

TCAATCCTCC TAGAATTCAG CCCCCAATTG CCCAGTTACC AATAAAAACT TGTAC-.CCAG 120

CCCCAGGGAC AGTCTCAAAT GCAAATCCAC AGA-GTGASMC ACCACCTCGG GTAG-ATTTG 180

ATGACAACAA TCCCTTTAGT GAAAGTTTTC AAGAACGGGA ACGTAAGCAA CGTTTACGAG 240

AACAGCAAGA GAGACAACGG ATCCAACTCA TGCAGGAGGT AGATAGACAA AGAGCTTTGC 300

AGCAGAGGAT GGAA^TGGAG CAGCATGGTA TGGTGGGCTC TGAGATAAGT ACTACTAGGA 360
CATCTGTGTC CCAGATTCCC TTCTACAGTT CCGACTTACC TTGTGATTTT ATGCAACCTC 420

TAGGACCCCT TCAGCAGTCT CCACAACACC AACAGCAAAT GGGGCAGGTT TTACAGCAGC 480

AGAATATACA ACAAGGATCA ATTAATTCAC CCTCCACCCA AACTTTCATG CAGACTAATG 540

AGCGAGGCAG GTAGGCCCTC CTTCATTTGT TCCTGATTCA CCATCAATCC CTGTTGGAAG 600

CCCAAATTTT TCTTCTGTGA AGCAGGGACA TGGA-ATCTT TCTGGGACCA GCTTCCAGCA 660

GTCCCCAGTG AGGCCTTCTT TTACACCTGC TTTACCAGCA GCACCTCCAG TAGCTAATAG 720

CAGTCTCCCA TCTGGCCAAG ATTCTACTAT AACCCATGGA C\CAGTTATC CGGGATCAAC 780

CCAATCGCTC ATTCAGTTGT ATTCTGATAT AATCCCAGAG GAAAAAGGGM AAAAAAAARA 840

AMAAHAAARA ARAAAGGAGA TGATGATGCA GAATTCCACC AAGGCTGC 888



(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2379 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

GGCAGAGCTA GTGTGGACTC CATCCCCCTG GAGTGGGATC ACGNCTATGA CCTCAGTCGG 60

GACCTGGAGT CTGCAATGTC CAGAGCTCTG CCCTCTGAGG ATGAAGAACG TCAGGATGAC 120

AAAGATTTCT ACCTCCGGGG AGCTGTTG3C TTATCAGGGG ACCACAGTGC CCTAGAGTCA 180

CAGATCCGAC AACTGGGCAA AGCCTGGATG ATAGCCGCTT TCAGATACAG CAAACCGAAA 240

ATATCATTCG CAGCAAAACT CCCACGGGGC CGGAGCTAGA CACCAGCTAC AAAGGCTACA 300

TGAAACTGGT GGGCGAATGC AGTAGCAGTA TAGACTCCGT GAAGAGACTG GAGCACAAAC 360

TGAAGGAGGA AGAGGAGAGC CTTCCTGGCT TTGTTAACCT GCATAGTACG GAAACCCAAA 420

CGGCTGGTGT GATTGACGGA TGGGAGCTTC TCCAGGCCCA GGCATTGAGC AAGGAGTTGA 480

GGATGAAGCA GAACCTCCAG AAGTGGCAGC AGTTTAACTC AGACTTGAAG AGCATCTGGG 540

CCTGGCTGGG GGACACGGAG GAGGAGTTGG AACAGCTCCA GCGTCTGGAA CTCAGCACTG 600

ACATCCAGAC CATCGAGCTC CAGATCAAAA AGCTCAAGGA GCTCCAGAAA GCTGTGGACC 660

ACCGCAAAGC CATCATCCTC TCCATCAATC TCTGCAGCGC TGAGTTCAGC CAGGCTGACA 720

GGAAGGAGAG CCGGGACCTG CAGGATCGCT TGTSGCAGAT GAATGGGCGC TGGGACCGAG 780

TGTGCTCTCT GCTGGAGGAG TGGCGGGGGC TGGTGGAGGA TGCCCTGATG CAGTGCCAGG 840

GTTTCCATGA AATGAGCCAT GGTTTGCTTC TTATGCTGGA GAAGATTGAC AGAAGGAAAA 900
ATGAAATTGT CCCTATTGAT TCTAACCTTG ATGCAGAGAT A-CTTCAGGAC CATCACAAAC 960

AGCTTATGCA AATAAAGCAT GAGCTGTTGG AATCCCAACT CAGAGTAGGC TCTTTGCAAG 1020

ACATGTCTTG CCAAGTACTG GTGAATGCTG AAGGAACAGA CTGTTTAGAA GCCAAAGAAA 1080

AAGTCCATGT TATTGGAAAT CGGCTCAAAC TTCTCTTGAA GGAGGTCAGT CGTCATATCA 1140

AGGAACTGGA GAAGTTATTA GACGTGTCAA GTAGTCAGCA GGATTTOTCT TCCTGGTCTT 1200
GTGCTGATGA ACTGGACACC TCAGGGTCTG TGAGTCCCAY ATCAGGAAGG AGCACCCCAA 1260

ACAGACAGAA AACGCCACGA GGCAAGTGTA GTCTCTCACA GCCTGGACCC TCTGTCAGCA 1320

GTCCAGATAG CAGGTCCACA AAAGGTGGCT CCGATTCCTC CCTTTCTGAG CCARGGCCAG 1380

GTCGGTCCGG CCGCGGGTTC CTGTTCAGAG TCCTCCGAGC AGCTCTTCCC CTTCAGGTTC 1440

TCCTGCTCGT CCTCATCGGG CTTGCCTGCC TTGTACCAAT GTCAGAGGAA GACTACAGCT 1500

GTGCOGTCTG CAACAACTTT GCCCGGTCAT TCCACCCCAT GCTCAGATAC ACGAATGGCC 1560

CTCCTCCACT CTGAACTAAG CAGATGCCAT CTGCAGAAGT GCTGGTAGCA TAAGGAGGAT 1620

CGGGTCATAA GCAATCCCAA ACTACCAACA AGAGGACCTT GATCTTGCCG AAAJGCO£TCG 1680

GTGTGGCAGC TTTAGCCTCC TCCAGATCAC ATGTGTGCAA ATTATGGCTT CAGAGGTGGA 1740

AGATAAACAG TGACGC-GC-GA ACAAACAGAC AACAAGAAGG TTTGGAAGAA ATCTGGTTTG 1800

AGACTCTGAA CCTTAGCACT AAGGAGATTG AGTAAGGACC TCCAAAGTTC CCCGGACTCA 1860

TGAATTGTGG GCCCTTGGCC MATTCTGTGG ACAGCCAAGG ACTTCAGTAG ACCATCTGGG 1920

CAGCTTTCCC ATGGTGCTGC TCCAACCATC AGATAAATGA CCCTCCCAAG CACCATGTCA 1980

GTGTCGTACA ATCTACCAAC CAACCAGTGC TGAAGAGATT TTAGAACCTT GTAACATACA 2040

ATTTTTAAGA GCTTATATGG CAGCTTCCTT TTTACCTTGT TTTCCTTTGG GGCATGATGT 2100

TTTAACCTTT GCTTTAGAAG CACAAGCTGT AAATCTAAAA GC<TACTTTTT TTTAGACGTA 2160

TAAAGAAAAA CTAGATGTAA TAAATAAGAT CATGGAAGGC TTTATGTGAA AAAAGTTGAA 2220

TGTTATAGTA AAAAAAAAAG ATATTTATGT ATGTACAGTT TGCTAAAGCC AAGTTTTGTT 2280

TGTATTGATT TCTTTGCATT TATTATAGAT ATTATAAAAT AAAAAAAAAA AAAAAAAAAG 2340

TCGAGGGGGG GCCCGGTACC CAATTCGCCC TATAGTGAG 2379



(2) INFORMATION FOR SEQ ID NO: 176:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1348 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

GCGCCTTCAC GATGCCGGCG GTCAGTGGTC CAGGTCCCTT ATTCTGCCTT CTCCTCCTGC 60

TCCTGGACCC CCACAGCCCT GAGACGGGGT GTCCTCCTCT ACGCAGGTTT GAGTACAAGC 120

TCAGCTTCAA AGGCCCAAGG CTGGCATTGC CTGGGGCTGG AATACCCTTC TGGAGCCATC 180

ATGGAGGTGA GGGGCAGGGG TGGGGACCGC TATGCCCAGG GTCCCTCAAA GTGCTGGAGG 240

GGCTGTSACT TGGTGGGGAG TGGGTCTGTC ACAGCCATCC TCTGTCCAGG GTGGGGCAAG 300

GCCTGGGACA GTGCCAGGCA CCCCAGGACC CCTTCCAGGC TTGTCTCCTG CTCCACCGCC 360

TCAACACCCC CCACCCCTGC CGAAGCTGTT TCTCCTCTGC CTCTCTMNTT CCCTGCCCCA 420

GGACTTCTCT CTTCTCCTCT GCCTCTCCTT GGACCCCTGC CCTTCCTCTA OCTCTGACCT 480

GTGAACACAC AGACACATGC TCACACACTA AGTCCCARGC ACACMSAAAG GGAATGTGGA 540

CCAGCACAAA CCTCCACTCT CCCGGCTCCA TCCCARG3GG CCTGTGGCTG GCCATGAAAA 600

CTGGGGGCTA CCTGGAGGGA AGCATCCTCA TCCCAGGTGA GTGGGCACCA G-CCCTTCCCT 660

GTATGTGTGT TGTGGGTGGA AGCAGGCATG AGAGCATCT? AGCCCATAGG TTTGTATTCA 720

GGGACTTCCA AftCCCAGACC TACAAAGAGT GTGTCTTCTA CCAGATCTTG TTCAAAAAAG 780

GGTTTGTGAT GATGGAACTA CACGATAGAG GGAGTGAGCA AGAACAATGA GGATTAGAGT 840

GGAGCGTGAA ATAGTCTAGG AGGATGGCTT CCAAAACATA TGCTGTGAGG TCTGTCO.CC 900

TGAGAGTTGG GCCATGGATT TAATTCTGAG CCTGTTAGCA GGCAAAGCAA AGACAGAAAG 960

CAGATCGGCT GTGGATTTCT GTCTATAAAA TGTGAGTTCT TGGGCGGGTG CGGTGGCTCA 1020

CGCCTGTAAT CCCGGCGCTT TGGGAGGCCA GGGCGGATGG GTCGCGAGGT CAGGAGGTTG 1080

GAAACCXTCC TGGCCGGAAT GGTGAAGCCC TGACTGTACT AGAAGTGCAA AGATTGCCTG 1140

GGTGTGGTGG CGTGCGCCTG TGGTCCCAGC TTCTCGGGAG GZTGAGGCGG GAGAGTTGCT 1200

TGGGCCTGGG AGGCGGAGGT TGCGGTGAGC TGAGATCGTG CCATTGCACT TCAGCCTGGG 1260

GAGAGAGGCA GACTCTGGCT CAAAAAAAAA AAAAAAAAAA ACTCGAGGGG GGCGCGTACG 1320

CAATTCGGGG MATATGATCG TAAACAAT 1348



(2) INFORMATION FOR SEQ ID NO: 177:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1502 base pairs
(B) TYPE: nucleic acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:



CTCAAAATAA ATAAATAAAT AAAAATTTGT ATTCCATTGA TTTGGGTAGA CACCAGGAAT 60

GTGCATTTCT AACAAGCTTT CCAGGCGATC CTATAGTAAG TCATCTGTGG ACTACTTTAA 120

GAAACTCTTC TATAGAGAAT GGAGTTGGAT TAATAATAGG TGATTTTTTA CACTGGACTG 180

ATTCACAAGA ACCTAAACAG TAGTCCATGA AGCTGCTCAT CTGTGGTAAC TATTTGGCCC 240

OGTCTCACTC TGAAAGCAGC AGGAGATGTT GTTTACTTTG TTTCTATCCC CTTTGTCTGG 300

AGATTAATTT TGGAATGAAA GTTTTTCTCT CTATGCCATT CCTGGTTCTT TTCCAAAGCC 360

TCATACAAGA GGATTAGGTC ACAATGCATG CATTACCTTT TAAAAGAATG CGATATTGAT 420

ACCGATGCTT ACTTTTTTTT TTTTTNACTA CTTGTTTTAT TCCTTCCAGN AAAGTATAGC 480

CCGCCTTTCT ATAGCATAGT TCTCTTTAGG TGGAATGATT CCTATAAGAT TTCTCATTAT 540

TAAATCATGC ATTTTTCAAG ATGGAATCAA TOTTTGAXTT AATCTAAGCT GATATTCTCA 600

TTTGTTAGAA GAACAACCTA GATGCTAGAG AGAGAGGAGG AAATATACCC ACGACCACAC 660
AGCCAGTTAG TATCCAGTTG G7GCTGGACT CCAGCOjC-GT GTCCTGCCTC ATGGTAGTTA 720

AATGATATAT AGAAAAGGTA AATTTTTAAA GAAATATTTA TTAATATATT CCTATAAAAC 780

ATTTTAAAGG TAACCACATA AAAATGGTTA ATTTTTCCAT TCCAAAGTAA ATGCTAAGCA 840

TGTTTATTAA TGAAGCAGTA CTTCTGATTA GTATATGACA TTCTGAAGTT AATTAAACTC 900

ATTGCACTAA ATGTGTCTTC CTTGGTATAG TGGAGGATTT GAGGATTGGA ATATAGAGTA 960

GAGTGCTTCC TTAAGCCTGG GAGCCCATCT TTATAGCTAT TTGATGTAAG AAAAGAGACA 1020

TGGMCCATTT CTAAACTATA TAAGGTGAGT GTGTCTATTC CCAGCAGATA TAAAGGAAAA 1080

AGGAAACTTT TTTGATTCCC ACCTTCCCAG CCTCACCTAG CCATCTTCCA GCCTGAAATA 1140

TAGAGATGTT AGTGCAAGGT CCTGGGGTCT AGGTGATCAT TTCATAAGTC CTTTACAGAT 1200

AAAGAAAAAG TAGTGTTTGT ATGTTTGTTT TTAAGTAACC CCAAAACAAA TTTATATTGT 1260

ATTCAGCAAA ATTGGAATTC AGGTGTTTAA TTTTAGAACA TGAAGTGCCT GCTGTTTL'AA 1320

GCATTGACTT GTATAAAAAG AATTGCATGT CTCCAGTAAG CTTATGGGTT TTCTCATTTT 1380

TAGGTATATG GCTTTTAATC ATGTAAAGTG AAAC^TTAGT TTTCTTGCAT TTTATTACAG 1440

GTTCTTTGTT GCAATAAAGA TGCTGCTGAA ATTAATTGAA AAAAAAAAAA AAAAAAACTC 1500

GA 1502



(2) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1637 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

ATTTTCTAGC CCACAAGGAC TGAAGTTCiG ATOCAAAAGT TCACTTGCTA ATTATCTTCA 60

CAAAAATGGA GAGACTTCTC TTAAGCCAGA AGATTTTGAT TTTACTGTAC TTTCTAAAAG 120

GGGTATCAAG TCAAGATATA AAGACTGCAG CATGGCAGCC CTGACATCCC ATCTACAAAA 180

CCAAAGTAAC AATTCAAACT GGAACCTCAG GACCCGAAGC AAGTGCAAAA AGGATGTGTT 240

TATGCCGCCA AGTAGTAGTT CAGAGTTGCA GGAGAGCAGA GGACTCTCTA ACTTTAGTTC 300

CACTCATTTG CTTTTGAAAG AAGATGAGGG TGTTGATGAT GTTAACTTCA GAAAGGTTAG 360

AAAGCCCAAA GGAAAGGTGA CTATTTTGAA AGGAATCCCA ATTAAGAAAA CTAAAAAAGG 420

ATGTAGGAAG AGCTGTTCAG GTTTTGTTCM AAGTGATAGC AAAAGAGAAT CTCTGTGTAA 480

TAAAGCAGAT GCTGAAAGTG AACCTGTTGC ACAAAAAAGT CAGCTTGATA GAACTGTCTG 540

CATTTCTGAT GCTGGAGCAT GTGGTGAGAC CCTCAGTGTG ACCAGTGAAG A-AACAGCCT 600

TGTAAAAAAA AAAGAAAGAT CVTTGAGTTC AGGATCAAAT TTTTGTTCTG A-.OAAAAAAC 660

TTCTGGCATC ATAAACAAAT TTTGTTCAGC CAAAGACTCA GAACACAACG AGAAGTATGA 720

C-GATACCTTT TTAGAATCTG AAGAAATCGG AAGAAAAGTA GAAGTTG7GG AAAGGAAAGA 780

ACATTTGCAT ACTGACATTT TAAAACGTGG CTGTGAAATG GACAACAACT GCTCACCAAC 840

CAGGAAAGAC TTCACTGAAG ATACCATCCC ACGGAACACA GATAGAAAGA AGGAAAACAA 900

GCCTGTATTT TTGCAGCAAA TATAACAAAG AAGCTCTTAG CCCCCCACGA CGTAAAGCCT 960

TTAAGAAATG GACACCTCCT CGGTCACCTT TTAATCTCGT TCAAGAAACA GTTTTTCATG 1020

ATCCATGGAA GCTTCTCATC GCTACTATAT TTCTCAATCG GACCTCAGGC AAAATGGCAA 1080

TACCTGTGCT TTGGAAGTTT CTGGACAAGT ATCCTTCAGC TGAGGTAGCA AGAACCGCAG 1140

ACTGGAGAGA TGTGTCAGAA CTTCTTAAAC GTGTTGGTCT CTACGATGTT CGGGCAAAAA 1200

CCATTGTCAA GTTCTCAGAT GAATACCTGA CAAAGCAGTG GAAGTATCCA ATTGAGCTTC 1260

ATGGGATTGG TGCACCCTGA AGACGACAAA TTAAATAAAT ATCATGACTG GCTTTGGGAA 1320

AATGATGAAA AATTAAGTCT ATCTTAAACT CTGCAGCTTT C-AGCTCATC TGTTATGGAT 1380

AGCTTTGCAC TTCAAAAAAG CTTAATTAAG TACAACCAAC CACGTTTGCA GCCATAGAGA 1440

TTTTAATTAG CCCAACTAGA AGCCTAGTGT GTGTGCTTTC TTAATGTGTG TGCCAATGGT 1500

GGATCTTTGC TACTGAATGT GTTTGAACAT GTTTTGAGAT TTTTTTAAAA TAAATTATTA 1560

TTTGACAACA ATCCAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1620

AAAAAAAAAA AAAAAAA 1637

(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2911 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

GGTGGTTTTT GTTCTGCAAT AGGCGGCTTA GAGGGAGGGG CTTTTTOGCC 60
GTAGCTTCTC CACGTATGGA CCCTAAAGGC TACTGCTGCT ACTACGGGGC TAGACAGTTA 120
CTGTCTCAGC TCTAGGATGT GCGTTCTTCC ACTAGAAGCT CTTCTGAGGG AGGTAATTAA 180
AAAACAGTGG AATGGAAAAA CAGTGCTGTA GTCATCCTGT AATATGCTCC TTGTCAACAA 240
TGTATACATT CCTGCTAGG? GCCATATTCA TTGCTTTAAG CTCAAGTCGC ATCTTACTAG 300
TGAAGTATTC TGCCAATGAA GAAAACAAGT ATGATTATCT TGCAAGTACT GTGAATGTGT 360
GCTCAGAACT GGTGAAGCTA GTTTTCTGTG TGCTTGTGTC ATTCTGTGTT ATAAAGAAAG 420
ATCATCAAAG TAGAAATTTG AAATATGCTT CCTGGAAGGA ATTCTCTGAT TTCATGAAGT 480
GGTCCATTCC TGCCTTTCTT TATTTCCTGG ATAACTTGAT TGTCTTCTAT GTCCTGTCCT 540
ATCTTCAACC AGCCATGGCT GTTATCTTCT CAAATTTTAG CATTATAA.CA ACAGCTCTTG 600
TATTCAGGAT AGTGCTGAAG ANGCGTCTAA ACTGGATCCA GTGGGCTTCC CTCCTGACTT 660
TATTTTTGTC TATTGTGGCC TTGACTGCCG GGACTAAAAC TTTACAGCAC AACTTGGCAG 720
GAC'GTGGATT TCATCACGAT GCCTT7TTCA GCCCTTCCAA TTCCTGCCTT CTTTTCAGAA 780
ATGAGTGTCC CAGAAAAGAC AATTGTACAG CAAAGGAATG GACTTTTCCT GAAGCTAAAT 840
GGAACACCAC AGCC\GAGTT TTCAGTCACA TCCGTCTTGG CATGGGCCAT GTTCTTATTA 900
TAGTCCAGTG TTTTATTTCT TCAATGGCTA ATATCTATAA TGAAAAGATA CTGAAGGAAG 960
GGAACCAGCT CACTGAAEGG ATCTTCATAC AGAACAGCAA ACTCTATTTC TTTGGCATTC 1020
TGTTTAATGG GCTGACTCTG GGCCTTCAGA GGAGTAACCG TGATCAGATT AAGAACTGTG 1080
GATTTTTTTA TGGCCACAGT GCATTTTCAG TAGCCCTTAT TTTTGTAACT GCATTCCAQG 1140
GGCTTTCAGT GGCTTTCATT CTGAAGTTCC TGGATAACAT GTTCCATGTC TTGATGGCCC 1200
AGGTTACCAC TGTCATTATC ACAACAGTGT CTGTCCTGGT CTTTGACTTC AGGCCCTCCC 1260
TGGAATTTTT CTTGGAAGCC CCATCAGTCC TTCTCTCTAT ATTTATTTAT AATGCCAGCA 1320
AGCCTCAAGT TCCGGAATAC GCACCTAGGC AAGAAAGGAT CCGAGATCTA AGTGGCAATC 1380
TTTGGGAGCG TTCCAGTGGG GATGGAGAAG AACTAGAAAG ACTTACCAAA CCCAAGAGTG 1440
ATGAGTCAGA TGAAGATACT TTCTAACTGG TACCCACATA GTTTGCAGCT CTCTTGAACC 1500
TTATTTTCAC ATTTTCAGTG TTTGTAATAT TTATGTTTTC ACTTTGATAA ACCAGAAATG 1560
TTTCTAAATC CTAAXATTGT TTGCATATAT CTAGCTACTC CCTAAATGGT TCCATCCAAG 1620
GCTTAGAGTA OCC*AAGGCT AAGAAATTCT AAAGAACTGA TACAGGAGTA ACAATATGAA 1680
GAATTCATTA ATATCTCAGT ACTTGATAAA TCAGAAAGTT ATATGTGCAG ATTATTTTCC 1740
TTGGCCTTCA AGCTTCCAAA AAACTTGTAA XAATCATGTT AGCTATAGCT TGTATATACA 1800
CATAGAGATC AATTTGCCAA ATATTCACAA TCATGTAGTT CTAGTTTACA TCOGAA^jGTC 1860
TTCCCTTTTT AACATTATAA AAGCTAGGTT GTCTCTTGAA TTTTGAGGCC CTAGAGATAG 1920
TCATTrXGCA AGTAAAGAGC AACGGGAGCC TTTCTAAAAA CGTTGGTTGA AGGACCTAAA 1980
TACCTGGCCA TACCATAGAT TTGGGATGAT GTAGTCTGTG CTAAATATTT TCCTGAAGAA 2040
GCAGTTTCTC AGA.CACAACA TCTCAGAATT TTAATTTTTA GAAATTCATG GGAAATTGGA
TTTTTGTAAT AATCTTTTGA TGTTTTAAAC attggttccc tagtcacca.t agttaccact 
TGTATTTTAA GTCATTTAAA CAAGCCACGG TGGGGCTTTT TTCTCCTCAG TTTGAGGAGA 
AAAATCTTGA. TGTCA.TTA.CT CCTGAATTAT TACATTTTGG AGAATAAGAG GGCATTTTAT 
TTTA.TTAGTT ACTAATTCAA GCTGTGACTA TTGTATATCT TTCCAAGAGT TGAAATGCTG 
GCTTCAGAAT CATACCAGAT TGTCAGTGAA GCTGATGCCT AGGAACTTTT AAAGGGATCC 
TTTCAAAAGG ATCACTTAGC AAACACATGT TGACTTTTAA CTGATGTATG AATA.TTAATA 
CTCTAAAAAT AGAAAGACCA GTAA.TATA.TA AGTCACTTTA CAGTGCTACT TCACACTTAA 
AAGTGCA.TGG TATTTTTCAT GGTATTTTGC ATGCAGCCAG TTAACTCTCG TAGA.TAGAGA 
AGTCAGGTGA TAGATGATAT TAAAAATTAG CAnACAAAAG TGACTTGCTC AGGGTCATGC 
AGCTGGGTGA TGA.TAGAAGA GTGGGCTTTA ACTGGCAGGC CTGTATGTTT ACAjGACTACC 
ATACTGTAAA. TATGAGCTTT ATGGTGTCAT TCTCAGAAAC TTATACATTT CTGCTCTCCT 
TTCTCCTAAG TTTCATGCAG ATGAATATAA £GTAATATAC TATTATATAA TTCATTTGTG 
ATATCCACAA TAATATGACT GGCAAGAATT GGTGGAAATT TGTAATTAAA ATAA.TTATTA 
AACCTAAAAA AAAAAAAAAA AAAAACTCGA G 

(2) INFORMATION FOR SEQ ID NO: 180:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 519 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:
GGCACGAGCC CCAGGCCAGC CJ-sZ-GGC CAGG CCTACTTTGG CCACCCTTAA ATTAGAATGT 
GGGGTCAGGG GTCACAGAAA AGCCATTTCT CTGACCTAGT GTTTGGCGTC CGGGAACTCT 
GTGCC'CAACC TTCAGACCCT GGCAGTCCTC ACTGAGGCCA TTGGCCCAGA GCCCGCCATC 
CCCCGARACC CCCGGGAGCC GCCTGTTGCC ACGTCCACAC CTGCCACACC CT CTC-C CGGG 
CCCCAGCCCC TCCCAACCGG GACCGTGGTG GTCCCTGGGG GTCCTGCCCC ACCTTGCCTT 
GGGGAGGCAT GGGCCCTCCT CCTCCCACCC TGCCGGCCGT CACTCACCTC TTGCTTCTGG 
TCCCCCAGGC CTAGCCCTTG GAAGGAGACA GGAGTCTAGG GAGGCTGAAG CCCACTCCCG 
GGGAGGCCCG TGCTCCTCCA GCCCCAGGGA aAGCAAC-GAA AAGAGAAGAG AGCAjGA.GCA.T
TTCATGGCTC TAATAAAAAA AA\AAA?AAA AAAACTC3A



(2) INFORMATION FOR SEQ ID NO: 181:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 968 base pairs
(B) TYPE: nucleic acid

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:

TCCCCTTGGG GCCGGAAAAA GCGGGdTTGG CCTGNCCATT GGTTNTCCAT 



CATGCCCCAG TACTAGCCTG CAGTCCCAAT GTAGCCCCTC CCTOTTCCHA GAGCCCYTG4 
AACCGCCCCG STCANTTGTG ATTTCAGGAG GATTTGATGA AGATGTTAAA GCGAAAGTGG 



AGAACCTTCT CGGGATTTCC AGCCTGGAAA AAACC-GACCC TOTTAGGCAA GCACCCTGCA 
GCCCTCCGTG TCCCCTTCTT CCCCTCCCCT TCYCCCGCCC GTGGAGACAG CTGTTYTGAG 
CAGGGCTCTC QGCZGGGZGG GGGCCGGCTC CTTOCCTGGC AGCAACATCC TTGCC CTTGT 



C\CA2AAGTC AGCCTGGATC TGCGCAGCTC TGTGGATGCG CTGCTGGAGG GCAACAGGTA 
TGTCACTGGC TGGTTGAGCC CCTACCACCG CCAGCGGAAG CTCATCCACC CGGTCATGGT 
TCAGCACATC CAGCCCGCAG CGGTCAGCGT GCTGGCAGAG TGGAGCACCC TCGTGGAGGA 
GCTGGAGGGT GCCCTGCAGC TGGCTTTCTA CCCGGATGCC GTGGAGGAGT GGCTGGAGGA 
AAACGTGCAC CCCAGCCTGC AGCGGCTGCA ARCTCTGCTG CAGGACCTCA GCGAGGTGTC 
TGCCCCCCCG CTGGCACCCA CCAGCCCTGG CAGGGACGTT GCTCAGGACC CCTGAGGGGA 
GAGCTCATGC CA3GGGGCTC CTGCTGGAGG CTGGGGGGGC TCTGCWTTK? OvWWTGGCCT 
GGGCAATACG GCCCACGTGG GCGTCGTGCC CTCTGGCCCA GCAGTGTCTT GCCCACACTC 
AGTTCCTGAG GGCCCTGGGC AGCCCCTGGG GGAGAGACTA GnAAACAGAG AAGGAAGCAG 
CACAGGGAGA CCCGCTTTGT GATCTGCATG TGTGACACTG ATTCTTTGGA AATAAAGAGT 
GGAAGCTG 



(2) INFORMATION FOR SEQ ID NO: 182:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1128 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:



TGTAAAAGTT ATCAGTAATC CTAATTCTTT TCCTGGGTTT TCCTTTTGTC ACTTATTAAT 60

CAGTTTTTGA AAGGACGAAT GAATTTAGAG ATGTACTCTG GAGCAGTA.TC ATGTTAAACC 120

AGGGGTATAT TAGAAAAA.TC ATOCTCATAA TCATTCTGGG AAGTTTTTCC TCCCCAAAAA 180

AAGCCATCCT GATGGGTTTT CAAAA.CCAGA AAAAAGCTCT TAATGAGGAA CAGA.CCACTG 240

GAGTACCCAT GAGGATCTCA GGAAAA.CTGA GACCCTCGAG AAGCCTTGAT TTGGTGCAAC 300

GCCGAAGGTT TCAGAGCCAG CAGCCCAGTG GTGTGGTTGA CAGACGTGGT TTTKTGGRGA 360

AAGCAGGGAG AGGGGAGGAA TTTTCAGAGT CGTGAGTCAC GRTYTGCGAC CCAAGATTAG 420

AGCAMAGATT AGCCATACTG AGATTTGGTA AAATCATTCT GTCTAAGCAA TGGAGGTGTG 480

TGCAMACGTG CAGTGCCTGT TCAjCAGGGGA TGCAGGGAGA TCSYGGGTTT AGaATC-GGGR 540

AGGGCACCGC ACCCCCmTC AYTGCPCTGC ACCTGCTCCC TCACGTGGAC ACTGTCCACA 600

AGTGTGGGTC TCA.GAGGAGA GTTGGCCAAG GAGCTCATAT CTTA.TTGGAG ATAGGGGGTC 660

GTAGAjGGTGA CATTCA.TCAG CAGTGTGAGC CGGGTGACAT GGGGGTGTCA ACCCAGCATC 720

TGTCCAGGAG CTCCTGCTGC AGCGGCTCTG GCAGGTGGCC TGAGGCTCCT TTTTGAGAGA 780

GAACTGTTTG GCCTTCCTGT CTCCTCTCCT CTGA.TCTGTT CTTTCTTGGA ACACCACCCA 840

AGAACGTCAC CTCCTCCATC AGATTGTGAG CTCCTGGAGG GOA.GGAGCTG TGTCCTTCTA 900

TTCATCTTCC TATCCCCAGA ACCTTGCACA GATCCTGGAA TGTGGTAGGT GCTCAGTAAA 960

TGTGTGTTGA ATAAATGAAT GAATGAATGA ACAAA-TGAAT GAATTTGCTT ACTTCAAGGG 1020

AAAAGAACCA TGAAACTGTA TTTTGAGTTT CTATGTTATA GCAGTCAGCA AA.TCCTATTA 1080

AATACTTTGT GTTTCCAAGC AAAAAAAAAA AAAJ^AAAAAA AAACTCGA 1128

(2) INFORMATION FOR SEQ ID NO: 183:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2276 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:



CCGCGC-GGT.C TGACCTCATG GCGTAGAGCC TA.GCAA.eAGC GCAGGCTCCC AGCCGAGTCC 60

GTTATGGCCG CTGCCGTCCC GAAGAGGATG AGGGGGC-CAG CACAAGCGAA ACTGCTGCCC 120

GGGTCGGCCA TCCAAGCCCT TGTGGGGTTG GCOCOGCCGC TGGTCTTGGC GCTCCTGCTT 180

GTGTCCGCCG CTCTATCCAG TGTTGTATCA CGGAGTGATT CACCGAGCCC AACCGTACTC 240

AACTCACATA TTTCTACCCC AAATGTGAAT GCTTTAACAC ATGAAAACCA AACC1AAACCT 300

TCTATTTCCC AAATCAGCAC CACCCTCCGT CCCACGACGA GTACCAAGAA AAGTGGAGGA 360

GCATCTGTGG TCCCTGATCC CTCGCCTACT CCTCTGTCTC AAGAGGAAGC TGATAACAAT 420

GAAGATCCTA GTATAGAGGA GGAGGATGTT CTCATGCTGA ACAGTTCTCC ATCCACAGCC 480

AAAGACACTC TAGACAATGG CGATTATGGA GAACCAGACT ATGACTGGAC CACGGGGCCC 540

AGGGACGACG ACGAGTCTGA TGACACCTTG GA^GAAAACA GGGGTTACAT GGAAATTGAA 600

GAGTCAGTGA AATGTTTTAA GATGCCATCC TCAAATATAG AAGAGGAAGA CAGCCATTTC 660

TTTTTTCATC TTATTATTTT TGCTTTTTGC ATTGCTGTTG TTTACATTAC ATATGACAAC 720

AAAAGGAAGA TTTTTCTTCT GGTTCAAAGC AGGAAATGGC GTGATGGCCT TTGTTCCAAA 780

AGAGTGGAAT ACCATCGCCT AGATCAGAAT GTTAATGAGG CAATGGCTTG TTTGAAGATT 840

ACCAATGATT ATATTTTTTA AAGGACTGTG ATTTGAATTT GGTTATGTAA TTTTATTTCG 900

TTGACTTTTT ATATGATATT GTGCAAATGT TTGCCATAGG CAATTGGTAC TTAAATGAGA 960

GGTGAGTCTC TCTTTTGCC7 TGGTGCTTTG GAAATTAAAT GTCACAAACG AGTATATAAT 1020

TTTTTATCTG TACTTTTAGA GCTGAGTTTA ATCAGGTGTG CAAAATGTGA GTTAAACATT 1080

ACCTTATATT TACACTGTTA GTTTTTATTG TTTTAGATTT ATTATGCTTC TTGTGGAAGT 1140

ATTAGTGATG CTACTTTTAA AAGATCCCAA ACTTGTAAGT AAATTCTGAC ATATCTGTTA 1200

CTGGTGACTC AC^TTCATTC TGCGCCATTC AAATACTATT TTTTATCCAC ATTTTTTTTT 1260

GTTCCCAAAC TGTAATGTAC AAGGATATGT GTGATAATGG TTTGGATTTG AGTAATATTT 1320

TTTTTTGTTC CAAGAAAACT GGTTTGGATA TTTTTAGATA ATTTAAACAT AATTTAGGAT 1380

AATGATATTG CTCAATCTGA CCACAATTTT AGGTAAAACA TTAAATGTGT CAGAAATCTT 1440

GGCAACAGAG ACTCTGCAGC TTGCAGTGGA CATAGATAA\ ATGTTACAGA GATACTATTT 1500

TTTTGGTTGG AATTACTATA TTAAATTTAG AAGCAGAAAC TGGTAAA-.TG TTAAATACAT 1560

GTACAATTGC TTTTAGTTAG CAATTGATTG TAGCATGGGT TGCTCCAAGG TTTCAAGCAA 1620

TGGGCAGAGT TTAAAATTAT ATCAGATTCG TTTACTTCGT TTATTATTTT ACAGTAAATT 1680

TGAATAAATC TTAGGGGTCA TTATCACTTA AATAATACTG TACCTAGGTC TTTCAAATTA 1740

AAATTATACC TGAATGAAGT TGTTTGTATA CATAAAGGAT ATTTGTGTAC AATTACCTTT 1800

TTTGGGCCAC ACTTGTTTTC TTTGTTTTTG TTTTTTATGG CAACTGGAAA GTATTTACTA 1860

TGGGATTGAT TTATGTCTGT CTTTCTATCA TAAAGAA.TTG ATGAATATGT AAATATGTGA 1920

TTTGAACCAT GGTTGACTTA GAAGTGTCAG TACAGCTTTT TAGAAAACAT AGGCCTAATA 1980

TATGTTA\GC AGGAGCCGGG TGAGCGAGTG GGGTTGCGCT TTATGTAGAG GTGG-AG-AG 2040

GCCGTCCATC CTGTCTCTTG GGCGGACAGT GTACTTTCCT AATAGGGAAG GGAAGCACAA 2100

TGGAAATACC CCTGAACCGT TTTATTGCAG TAA.TTTTTTT CATA.TCTGAA ACTATTATTT 2160

AATA.TTTTGA ATAAGATTTT AAAAAATAAA TGGCrtAAGAT ATAAATCTAA AAAAAAAAAA 2220

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAnAAA AMAAAAAAA AAAAAA 2276

(2) INFORMATION FOR SEQ ID NO: 184:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2500 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

TCCAAGCTAC GCCACTCGGG CTGGGGCGTT GGGAGCGGGA GTGCAGAGCG TGGTCGTGGC 60

GGCGGCGGTG AGAAGAGCGA GGCGKAGGAG GGGGTGCCAT GGCCGGGCAG CAGTTCCAGT 120

ACGATGACAG TGGGAACACC TTCTTCTACT TCCTCACCTC CTTCGTGGGG CTCATCGTGA 180

TCCCGGCGAC ATACTACCTC TGGCCCCGAG ATCAGAATGC CGAGCAAATT CGATTAAAGA 240

ATATCAGAAA AGTATA.TGGA AGGTGTATGT GGTACGTTTA CGGTTATTAA AACCCCAGCC 300

AAATATTATT CCTACAGTAA AGAAAATAGT TCTGCTTGCA GGATGGGCAT TGTTCTTATT 360

CCTTGGATAT AAAGTTTCCA AAACAGACCG AGAATACCAA GAATACAATC CTTATGAAGT 420

ATTAAATTTG GATCCTGGAG CCAQAGTAGC AGAAATTAAA AAACAATATC GTTTGCTGTC 480

ACTTAAATAT CATCCAGATA AAGGAGGTGA TGAGGTTATO TTCATG.AGGA TAGCAAAAGC 540

TTATGCTGCT TTAACGGATG AAGAGTCCCG GAAAAATTGG GAAGAATTTG GAAATCCAGA 600

TGGGCCTCAA GCCAGAAGCT TTGGAATTGC CCTGCCAGCT T.GGATAGTTG ACCAGAAAAA 660

TTCAATTCTG GTTTTACTTG TATATGGATT GGCATTTATG GTTATCCTTC CAGTTGTTGT 720

GGGCTCTTGG TGGTATCGCT CAATACGCTA TAGTGGAGAC CAGATTCTAA TACGSACAAC 780

ACAGA.TTTA.T ACATACTTTG TTTATAAAAC CCGAAATATG GATATGAAAC GTCTTATCAT 840

GGTTTTGGST GGAGCTTCTG AATTTGATCC TCAGTATAAT AAAGATGCCA CAAGCAGACC 900

AACGGATAAT ATTCTAATAC CA.CAGCTAAT CAGAGAAATT GGCAGCA.TTA ATTTAAAGAA 960

GAATGAGCCT CCACTTACCT GCCCATATAG CCTGAAGGCC AGAGTTCTTT TACTGTCTCA 1020

TCTTGCTAGA ATGAAAATTC CTGAGACCCT TGAAGAAGAT CAGCAATTCA TGCTAAAAAA 1080

GTGTCCTGCC CTACTTCAAG AAATGGTTAA TGTAATCTGC CAA.CTAATAG TAATGGCCCG 1140

GAACCGTGAA GAAAGGGAGT TTCGTGCTCC AACTTTGGGA TCCC~AGAAA ACTGCATGAA 1200

GCTTTCTCAC ATGGGCGTTC AGGGACTTGA GGAATTTAAG TCTCCCCTTC TGGAGGTGGC 1260

TGATATTGAA CAJGGACAATC TTAGACGGGT TTCTAATCAT AAGAAGTATA AAATTAAAAC 1320

TA.TCCAGGAT TTGGTGAGTT TAAAAGAATC AGATCGTCAC ACTCTACTGC ACTTCCTTGA 1380

AGATGAAAAiC TA.TGAAGAGG TTATGGCTGT CCTTGGGAGT TTTCCATATG TGACCATGGA 1440

TATAAAATCA CAGGTGTTAG ATGATGAAGA TAGCAAGAAC ATCACAGTAG GATCCTTAGT 1500

TACAGTGTTG GTTAAGTTGA CAAGGCAAAC AATGGCTGAA GTATTTGAAA AGGAGCAGTC 1560

CATCTGTGCT GCAGAGGAAC AGCCAGCAGA AGA1GGGCAG GGTGAAACTA ACAAGAACAG 1620

GACAAAAGGA GGATGGCAAC AGAAGAGTAA AGGACCCAAG AAAACTGCTA AATCAAAAAA 1680

AAAGAAACCT TTAAAAAAAA AACCTACACC TCTGGTATTA CCACAGTCAA AGCAAGAGAA 1740

AGAAAAGGAG GCAAATGGAG TCGTTGGGAA TGAAGCTGCA GTAAAGGA.AG A.TGAAGAAGA 1800

AGTTTCAGAT AAGGGGAGTG ATTGTGAAGA AGAAGAAAGG AATAGAGATT CGCAAAGTGA 1860

GAAAGATGAT GGTAGTGACA GAGACTGTGA TAGAGAGGAA GATGAAAAAC AAAACAAAGA 1920

TGATGAAGGA GAGTGGGAAG AATTACAACA AAGCATAGAG CGAAAAGAGA GAGGTCTATT 1980

GGAAACCAAA TGAAAAATAA CACATGCTGT GTATAGGCTT TACTTTCCTG AGGAAAAACA 2040

AGAATGGTGG TGGCTTTACA TTGCAGATAG GAAGGA.GGAG ACATTAA.TAT CCATGGCATA 2100

TCA.TGTGTGT ACGCTGAAAG ATACAGAGGA GGTAGAGGTG AAGTTTCCTG CACCAGGCAA 2160

GGGTGGAAAT TATCAGTATA GTGTGTTTCT GAGATGAGAG TCCTATATGG GTTTGGATCA 2220

GATTAAACCA TTGGAAGTTK GGAAGTTCAT GAGGCTGAAG CCTGTGCCAG AAAATCAGCG 2280

ACAGTGGGAT ACAGCAATAG AGGGGGATGA AGAGGAGGAG GACAGTGAGG GGTTTGAAGA 2340

TAGGTTTGAG GGAGGAAGAG GGAGGGAGGA AGGAAGGTGG TGGACTTAAG GGAGTTAGTC 2400

TGGAATGGGA CCCACAGTGT TTTGCACGAT ATTTTGGCAA TTTTTTTTGC CCGTTTTTMG 2460

GAAGTGTTTT CCOTNAANCC CAGGAACGAT TACAGAA.CCG 2500



(2) INFORMATION FOR SEQ ID NO: 185:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1337 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:

CTTCCGGTTC TCCGGGCAGC TGGCACTGCT GTAGGTTCTG CCACCTGCCA CGACCGCGGC 60

TCTCCCTGGC 
GCCAGCGTGG 
GCCTCGCAGT 
CTGCCTGAAC 
CTTGGGCCTC 



AATGGCAGCA 

GGCTCGGTGG 

CGGGCCCTGA 

ACGGTCAGGA. 

ATAGAAAATA 

TTTGATGCCA 

CTACAATGAA 

GGGGGTATTT 

TATCCTTTTA 

GGGGATAAAG 

TGATCGGTTT ' 

TAATGCTAAT 

TGGTGTGTAA 

AAAAGAAGCC 

TTATTGTTTG 



GTTTGGTCAC 
CGGGCCTGGC 
GCTGCTGCTG 
TAAGCGGGYC 
- CTGACCCTAG 
CGGGCCGTGG 
AGCCTGTGGC 
GTGGCGGCCT 
CCGTGTTGAT 
TGAGAAGAAG 
TGGAATTGAC 
ATCATCCTCG 
GAGTGGAATT 
AAGTTACATA 
TTATATCTTT 
TATACTGTAA 
TTGTTTCCTG 
TATTTTTGCT . 
AAATAATTTA 
AAATTTATTA 
G^iAAACAATA 
GCCCGGM 



CTCTGCTTCA TTCTCCACCG CGCCTATGGT 
GGCTCCCGGG TGGTGAGAGA GCGGTCCGGG 
TCTOGCCAC CTCTTGGCTT CCGTCCTCCT 
CCTGGMAGTC CTGCTGCAGG CAGCCGAGGC 
ACCACGGACA TTACCGGCGG TGCCACCC-GG 
TCTGGGTGAA GCTGCGGGGC CGCGGGGCTC 
CGGGCTTGAG ACGGACGATC ACGGAGGGAA 
TGCTG TGAGC CG CAAGCGTG GCGACAAGCG 
GGTGGTGAGC GGCGCGGTGC TGGTGTAGTT 
AAACCGAAAG ACTAGGAGAT ATGGAGTTTT 
ACCTTTAGAA OGGATGATG AGGATGATGA 
AAGATAAGAA TGTGCCTTTT GATGAAAGAA 
TGTATGTTTA AGGAATAAGA AGCCACTATA 
TATTTTAACA ACGTTTAATT TGCTGTTGCA 
ATATGTATAG AAGTACTCTR TTAATGGGCT 
TAATTTATCT GTTTGAAAAT TACTATAAAA 
CTT AC CAT AT GATTGTAAAT TGTTTTATGT 
GATGTCATAT GTTAAAGAGC TATAAATTCC 
AAATTTCCTT TACTGAAAGG TATTTCCCAT 
CTTTGTGTTG GGGTTTTTAA AATATTAAGA 
AATATGATTT TAAATTCTCT TAAAAAAAAA 



CCCTCTTGGA 
AAC GATGAAG 
CCTGCTGTTG 



CCCTACCCCT 
CGAGGGAGGC 
GGCCGGGGAA 
CATGACCCAG 
CGTGGTCAGG 
GGP..G\CTAAC 
CAA.CACGTTG 
CTTTATCTTT 
TCAATGTTGG 
A.TAAATACCG 
CAGAGATGTT 
CGGTGTTTTC 
ATTAATCAGT 
AACAACCAAC 
TTTTGTGGGG 
AATGTCTAAG 
AAAAAAAACC 



120 
130 
240 
300 
2 60 
420 
430 
540 
600 
560 
720 
730 
340 
900 
960 
1020 
1080 
1140 
1200 
1250 
1320 
1337 



(2) INFORMATION FOR SEQ ID NO: 186:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 941 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:

GGCACGAGCC TGGACGCAGC AGCCACCGCC GCGTCCCTCT CTCCACGAGG CTGCCGGCTT 60

:CA GCTCCGACAT GTCGCCCTCT GGTCC-CCTG? GTCTTCTCAC CATCGTTGGC 120

CTGATTCTCC CCACCAGAGG ACAGACGTTG AAAGATACCA CGTCCAGTTC TTCAGCAGAC 180

TCAACTATCA TGGACATTCA GC-TCCCGACA CGAGGGCCAG ATGCAGTCTA O-CAGAACTC 240

CAGCCCACCT CTCCAACCCC AACCTGGCCT GCTGATGAAA CACGACAACC CCAGACCCAG 300

ACCCAGCAAC TGGAAGGAAC GC-ATGGGCCT CTAGTGACAG ATCCAGAGAC ACACAAGAGC 360

ACCAAAGGAG CTCATCCCAC TGATGACACC ACGACGCTCT CTGAGAGACC ATCCCCAAGC 420

ACAGACGTCC AGACAGACGG CCAGA.CCCTC AAGCCATCTG GTTTTCATGA GGATGACCCC 480

TTCTTCTATG ATGAACACAC CCTCCGGAAA CGGGGGGTGT TGGTCGCAGC TGTGCTGTTC 540

ATCA.CAGGCA TCATCA.TCCT GACCAGTGGC AAGTGCAGGC AGCTGTCCCG GTTATGCGGG 600

AA.TCATTGCA GGTGAGTCCA TCAGAAACAG GAGCTGACAA CCYGCTGGGC ACGCGAA.GA.C 660

CAAGCCCCCT GCCAGCTCAC CGTGCCCAGC CTCCTGCATC CGGTCGAAGA GCCTGGGCAG 720

AGAGGGAAGA CACAGATGAT GAAGCTGGAG CCAGGGCTGC CGGTCCGAGT CTCCTACCTC 780

CCCCAACCCT GGCCGCCCCT GAAGGGTACG TGGCGCCTTG GGGGCTGTCC CTCAAG7TAT 840

CTCCTCTGYT AAGA.CAAAAA GTAAAGCAGT GTGGTCTTTG CAAAAAAAAA AAAAAAAAAA 900

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAACTCG A 941



(2) INFORMATION FOR SEQ ID NO: 187:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 654 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:

GAATTCGGCA CGAGGCAGCT TGTGCXTTAA AGGAGGTGTT CAAAGCATGT CTGAGCAGAG 60

ACTTTTGGGC TCTGTTTTAA TTAATACTTT AAAA.TAATTC ATATTTAAAA TATCAPA.TGT 120

TTCCATAAAG AGGAGG^.TGT TTAAATGCCT CCA.GACTACA TTCCTTTTTA TTSCTTGATT 180

TTACCTGGGA GTCCAAAGTT CAATTCCCAT AAACCAAGCG TTTTATTTGT CACTTTCAAT 240

ATACATCCGA TTGCCATGCT TAA.GATGCAA TATGGGCTGC GGAAA.TAGGT TAACCCACAG 300

GCTCCCAGGG CCCAGTGTAG AAGGTGAGAG ATTCGTGTAA AATGATTCAA ATAAAAGGAA 360

GACCGTGGCC GGGTGCCGTA RCTCACGCCT GTAATCCCAG CACTTTGGGA GGCCGAAGCG 420

AGTGGATGAC GAGGTTAGGA GTTGGAGACC AGCCTGGCCA ACATCGTGAA ACCCCGTCTC 480

TACTAAAAAT ACAAAAATTA GCCGGGCATG GTGGCAGGCA CCTGTAATCC TAGCTAGTTG 540

GGAGGCTGAG GCAGGAGAAT CGTTTGAA.TC TGCCAGTTC-G AGGTTCTCAG TGAGCTGAGA
TCGCGCCACA GCACTCCAGC CTGGGTGACA GGGTGAGACT CTGTCTCAAA MAGA

(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1848 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:
GAAACTGGAC CGGAGAACCG GAGCGAAGC G AAGCGGAAGC CCC-GAATGAG GCCGGACTGG 
AAAGCCGGAG CGGGGCCAGG CGGGCCTCCC CAAAAGCCTG CCCCTTCATC CCAGCGGAAA 
CCGCCGGCZC GGCCG AGC GC GGCGGCCGCT GCGATTGCAG TCGCGGCGGC GGAGGAAGAG 
AGACGGCTCC GGCAGCGGAA CCGCCTGAGG CTGGAGGAGG AIAAACCGGC CGTGGAGCGG 
TGCTTGGAGG AGCTGGTCTT CGGCGACGTC G AGAACGACG AGG ACGCGTT GCTGCGGCGT 
CTGCGAGGCC CGAGGGTTCA AGAACATGAA GACTCGGGTG ACTCAGAAGT GGAGAATGAA 
GCAAAAGGTA ATTTTCCACC TOAAAGAAG CCAGTTTGGG TGGATGAAjGA AGATGAAGAT 
GAGGAAATGG TTGACATGAT GAACAATCGG TTTCGGAAGG ATATGATGAA AAAIGCTAGT 
GAAAGTAAAC TTTCGAAAGA CAACCTTAAA AAGAGACTTA AAGAAGAATT CCAACATGCC 
ATGGGAGGAG TACCTGCCTG GGCAGAGACT ACTAAGCGGA AAACATCTTC AGATGATGAA 
AGTGAAGAGG ATGAAGATG A . TTTGTTGCAA AGGACTGGGA ATTTCATATC CACATCAACT" 
TCTCTTCCAA GAGGCATCTT GAAGATGAAG AACTGCCAGC ATGCGAATGC TGAACGTCCT 
ACTGTTGCTC GGATCTCCAT CTGTGCAGTT CCATCCCGGT GCACAG-ATTG TGATGGTTGC 
TGGGATTAGA . TAATGCTGTA TCACTATTTC ASGTTGATGG GAAAAC AAAT - CCTAAAATTC 
AGAGCATCTA TTTGGAAAGG TTTCCAATCT TTAAGGCTTG TTTTAGTGCT AATGGGGAAG 
AAGTTTTAGC CACGAGTACC CACAGCA.AGG TTCTTTATGT CTATGACATG CTGGCTGGAA 
AGTTAATTCC TGTGCATCAA GTGAGAGGTT TGAAAGAGAA GATAjGTGAGG AGCTTTGAAG 
TCTCCCCAGA TGGGTCCTTC TTGCTCATAA ATGGCA.TTGC TGGATATTTG CATTTGCTAG 
CAATGAAGAC CAAAGAACTG ATTGGAAGCA TGAAAATTAA TGG AAGGGTT ' GCAGCATCCA 
CATTCTCTTC AGATAGTAAG AAAGTATACG CCTCTTCGGG GGATGGAGAA GTTTATGTTT 
GGGATGTGAA CTGAAGGAAG TGCCTTAACA GATTTGTTGA TGAAGGCAGT TTATATGGAT
TA&GCATTGC CMLATCTAGG AATGGACAGT ATGTTGCTTG TGGTT.CTAAT TC-TGGAGTGG 1320
TAAATATATA CAATCAAGAT TCTTGTCTCC AAGAAACAAA CCCAAAGCCA ATAAAAGC7A 1380
TAATGAACTT COTTACAGGT GTTACTTCTC TGACCTTCAA TCCTACTACA GAAATCTTGG 1440
C-ATTOCTTC AGAAAAAAXG A-ACAAGCAG TCAGATTGGT TCATCTTCCT TCCTGTACAG 1500
TATTTTCAAA CTTCCCAGTC ATTAAAP-ATA AGAATATTTC TCATGTTCAT ACCATGGATT 1560
TTTCTCCGAG AAGTGGATAC TTTGCCTTGG GGAATGAA-A GGGCAAGGCC CTGATGTATA 1620
GGTTGCACCA TTACTCAGAC TTCTAAAGAG ACTATTTGAA GTCCAGTTGA GTCACAAJGAG 1680
AAGCCTGTCT TGATATATCA TCTCAGPAAC TTTCCTGAAT ATGTGATAAT ATATGGAAAA 1740
TGATTTATAG ATCCAGCTGT GCTTAAGAGC CAGTAATGTC TTAATAAACA TGTGGCAGCT 1800
TTTGTTTGAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAACTCGA 1848



(2) INFORMATION FOR SEQ ID NO: 189:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1146 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:

AAAAAAAACC CAGGGGAACN TTGGGGGCCG CTTTNNNTTC CCCCTCCAGG CCATTGGGGA 60

ATTCTTCAAG TTAATCCTGO TTTGCTCTTG GCCAAGAGGG CTTGTAGGGG GGAGAGACCC 120

AGGATCATCA AGGGGTTCGA GTGOAGCCT CA2TCCCAGC GCTGGCAGGC AGCCCTGTTC 180

GAGAAGACGC GGCTACTCTG TGGGGCGACG CTCATCGCCC CCAGATGGCT CCTGACAGCA 240

GCCCACTGCC TCAAGCCCCG CTACATAGTT OVCCTGGGGC AGCACAACCT CCAGAAGGAG 300

GAGGGCTGTG AGCAGACCCG GACAGCCACT GAGTCCTTCC CCCACCCCGG CTTCAACAAC 360

AGCCTCCCCA ACAAAGACCA CCGCAATGAC ATCATGCTGG TGAAGATGGC ATCGCCAGTC 420

TCCATCAC-CT GGGCTGTGCG ACCCCTCACC CTCTCCTCAC GCTGTGTCA.C TGCTGGCACC 480

AGCTGYCTCA TTTCCGGCTG GGGCAGMACG TCCAGCCCCC AGTTACGCCT GCCTCACAOC 540

TTGSGATGCG CCAACATCAC CATCATTGAG CACCAGAAGT GTGAGAA.CGC CTACCCOGGC 600

AACATCACAG ACAGCATGGT GTGTGCCAGC GTGwvGGAAG GGGGCAAGGA CTCCTGCCAG 660

GGTGACTCCG GGGGCCCTCT GGTCTGTAAC C-GTCTCTTC AAGGCATTAT CTCCTGGGGC 720

CAGGATCCGT GTGCGATCAC CCGAAAGCCT GGTGTCTACA CGAAAGTCTG CAAATATGTG 780

GACTGGATCC A.GGAGACGAT CAAGAACAA.T TAGACTGGAC CCACCCACCA CAGCCCATCA 840

CCCTCCATTT C-CACTTGC-TG TTTGGTTCCT GTTCACTCTG TTAATAJ^JGAA ACCCTAAGCC 900

AAGACCCTCT ACGAACATTC TTTGGGCCTC CTGGACTACA GGAGATGCIG TCACTTAATA 960

ATCAACCTGG GGTTCGAAA.T CAGTGAGACC TGGATTCAAA TTCTGCCTTG AAATATTGTG 1020

ACTCTGGGAA TGACAACACC TGGTTTGTTC TCTGTTGTAT CCCCAGCCCC AAAGACAGCT 1080

CCTGGCCATA TATCAAGGTT TCAATAAATA TTTGCTAAAT GAAAAARAAA AArAAAAAAA 1140

ACTCGA 1146



(2) INFORMATION FOR SEQ ID NO: 190:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 906 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:

ACTCCCTCAC CCAGGTCCCA GCCCTGGGAA CCACCTACCG TGAGCCCTTT TGCAGATATA 60

GACTCATTTC ATCCTCAGAT GGTCCTTCAA GGTAGGTACT TTAGTCCCAT TTTAGAGATG 120

AGACGATTGA GGCCAGAGGG GTGNNGTAAC TTGCCTGGGG GCTCACGAGC ACAAAAGGAG 180

CCGAGGCAGG ATCTGACCCT TGTTCTCTGG CCTCACTGCC CTCACTTTGC CATGACCCGA 240

AGTTATGTCC CTACAAAGCA ATGCATGGTC CAAGGYTCTT TTTATTGTAT TTTTATTTTT 300

AAGGGTCCTG TTCAAAACTG GTGTGAGCTC TGAGGAGTCC TGAACCCTGG GTGCAGCATC 360

CTAGCATCCT GGGAGTCCTT TTCTGCCCAC ACTGAGCTGG GCTCCTCGAG GGGTGGGGCT 420

GCTGTCCCTG GAAGCCTGGC AGCAGCACTG TATCGGGTTG GCTGAAGCTG AHCGCCGTGG 480

GGTGCAGGGC TCCMGGAATC CCCGTTTGGC TGAAGGGGTT CCCTGTAGCC MGGGATGTTT 540

ATGAGGTCTC TCTGATGCCC CAGGCGCAGG ACATGTGTGC GGGTGGAGAA AAGCAGGCCC 600

TTTCAGTGCC AGCTCCACTC AATTTCTATG TGGACCAAGA ACGATAAACT TAAAAAATTT 660

TTTTTCCTAA GGTATCTTCA GAATATGGTG TATTTTTATG TGGAAAAGAA AAGTTATGAA 720

GGCAGCTGTT ACTTTAAGAG AAAATTCATT AAAAGTCCTC GAGGTATGAA GATGACGGCG 780

TGCTTCTCAA TCATTTTGGC ATAACTTGAT TGTGGCTGTA ATTTTTTTTT TTTTTTTTGT 840

CAAGCATGTC AGACAATAAA GTCTTTGTAA AAAGRGAAAA AAAAAAAAAA AAAAAAAAAA 900

ACTCGA 906

(2) INFORMATION FOR SEQ ID NO: 191:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1941 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:

CTTCAGCTGA AGCCCAGGGA CGCCTTTTCC ACCCTGGGCC CCAATGCCGT CCTTTCCCCG 

CAGA.GACTGG TCTTGGAAAC CCTCAGCAAA CTGAGCATCC AGGACAA.CAA. TGTGGACCTG 

AITCTGGCCA CACCCCCCTT CAGCCGCCTG GAGAAGTTGT ATAGCACTAT GGTGCGCTTC 

CTCAGTGACC GAAAGAAGGC GGTGTGGGGG AGATGGCTGT GGTACTGGTG GCCAACGTGG 

CTCAGGGGGA CAGCCTGGCA GCTCGTGCCA TTGCAGTGGA GAAGGGCAGT ATCGGCAACC 

TCCTGGGCTT CCTAGAGGAC AGCCTTGGGG CCACACAGTT CCAGCAGAGC CAGGCCAGCC 

TCGTGGACAT GGAGAACGGA CCCTTTGAGC CAAYTAGTGT GGACA.TG~.TG CGGCGGGCTG 

CCCGCGCGCT GCTTGCCTTG • GCCAAGGTGG ACGAGAACCA CTCAGAGTTT ACTCTGTACG 

AATCACGGCT GTTGGACATC TCGGTATCAC CGTTGATGAA CTGAKTGGTT TC^AAGTCA 

TTTGTGATGT ACTGTTTTTG NATTGGCCAG TCATGACAGC CGTGGGACAC CTCCCCCCCC 

CGTGTGTGTG TGCGTGTGTG GAGAACTTAG AAACTGACTG TTGCCCTTTA TTTATGCAAA 

ACCACCTCAG AATCCAGTTT ACCCTGTGCT GTCCAGCTTC TCCCTTGGGA AAAAGTCTCT 

CCTGTTTCTC TCTCCTCCTT CCACCTCCCC TCCCTCCATC ACCTCACGCC TTTCTGTTCC 

TTGTCCTCAC CTTACTCCCC TCAG3ACCCT ACCCCACCOT CTTTGAAAAG ACAAAGCTCT 

GCCTACATAG AAGACTTTTT TTATTTTAAC CAAAGTTACT GTTGTTTACA GTGAGTTTGG 

GGAAAAAAAA TAAAATAAAA ATGGCTTTCC CAGTCCTTGC ATCAACGGGA TGCCACATTT 

CATAACTGTT TTTAATGGTA AAAAAAAAAA AAAAAAATAC AAAAAAAAAT TCTGAAGGAC- 

AAAAAAGGTG ACTGCTGAAC TGTGTGTGGT TTATTGTTGT ACATTCACAA TCTTGCAGGA 

GCCAAGAAGT TCGCAGTTGT GAACAGACCC TGTTCACTGG AGAGGC CTGT GCAGTAGAGT 

GTAGACCCTT TCATGTACTG TACTGTACAC CTGATACTGT AAACATACTG TAATAATAAT 

GTCTCACATG GAAACAGAAA ACGCTGGGTC AGCAGCAAGC TGTAGTTTTT AAAAATGTTT 

TTAGTTAAAC GTTGAGGAGA AAAAAAAAAA AGGCTTTTCC CC'GAAAGTAT CATGTCTGAA 

CCTACAACAC CCTGACCTCT TTCTCTCCTC CTTGATTGTA TGAATAACCC TGAGATCACC 

TCTTAGAACT GGTTTTAACC TTTAGCTGCA GCGNCTAGGT OIAWCGNTOT GTATATATAT 

GACGTXGTAC ATTGCA.Ch.TA C CCTTGGATC CCCACAGTTK GGTCCTCCTC CCAGCTACCC
CTTTATAGTA TGACGAGTTA ACAAGTTGGT GACCTGCACA AAGCGAGACA CAGCTATTTA
ATCTCTTGCC CAGATATCGC CCCTCTTGGT GCGATGCTGT ACAGGTCTCT GTAAAAAGTC 
CTTGCTGTCT CAGCAGCCAA TCAACTTATA GTTTATTTTT TTCTGGGTTT TTGTTTTGTT 
T^GTTTTCTT TCTAATCGAG GTGTGAAAAA GTTCTAGGTT CAGTTGAAGT TCTGATGAAG 
AAACA.CAA.TT GAGATTTTTT CAGTGATAAA ATCTGCATAT TTGTATTTCA ACAA.TGTAGC 
TAAAACTTGA TGTAAATTCC TCCTTTTTTT CCTTTTTTGG CTTAATGAAT A.TCATTTATT 
CAGTATGAAA TCTTTATACT ATATGTTCCA CGTGTTAAGA ataaatgtac attaaatctt 
GGTAAGACTT TAAAAAAAAA A 

(2) INFORMATION FOR SEQ ID NO: 192:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2118 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:

AAATAATAAT AAMAATAAAT AAAAATWAAG TGCTTAKTGT AACTCAGCGG ACAGGGCTCC 

CAGCTGCTCT GGCACGTGGG ACACCYTCCA CCCTGCACAC AACAGGCATG CAAAGAGGAC 

TGGATATGGT GGGGTAGAGT GCTTCTGGTG TGTTCACTTT AAGAAAACAT CTGCCAAGAG 

AGAAGAGTGC C CAGGAAAG A CCAGGAAAAT ACAAGTACAT GGCTGCTTCA. TACCATATAC 

CCCAATTCTT TAAAGCAGCA AAAGGCACTT TTTTTTTCAG GCCAGAGTGA ATCTAAAACA 

AACCTGGCTT TGCTTACAGG GAAGCTGTCC CAGAAGGACT GAGTGATGCC TCTTGTTCCC 

TAAGGTCTGG AGAGTCTTTG CAAGTTTCCA ACGACATTTC CAACCAGGTG GGAGAGACCA 

GCAGTTGACG AGACAAGTCA GACCCAAAAA ACGACGCCAA GGTAGTGAGT GGGTGCCTAT 

TTGGGAGTAG GATGATTTGA GGAAAACAGG AAGAAAAACC GGTCAGAAAG TGGCACTTZG 

GAAGTGGAAA ' GCTGTTTGCA AATAGCAACT CTGGCTAAAG CGAAAATGTT .AATCAAGTAG 

AAAGTAAAAT TCAGGATCTT AGAAGCTCAT CCTTC T GATG AGAACTATTT TTT T TT CCGT 

GAAGGAACTA TTATTACTTT AAAAGTGAGG GTAATTTACA TATGGGGTGT ATATATTCTA 

AAAATAGTAA TAAAAGTACC TTTTATAAGC AATGTTGTGT GGCTTGTAGA AGAAAGCAGG 

GAGGAAAAAA AGGCAGGCAA AACTAGTCTA GGTCTAGGCC CTAAAAATGA GCTTCCTTCC 

CACTTGACTG GAAACGCCCA TGTGATTTCT AGGCTGAAAA TAGGTAGGAT TTAA.CGAGTA

ACCTAGTTCC CTTCTGTCTC TGATTTCTGA TCAGCTGATG GAGCTGCTAG TAAGAGGGGC 960

GGn.TCA.TGCT CCCAGACGAG TCCTTTGGCC TCTTGCTCTC CATCCC\AGC CTGACTCCTT 1020

CiJGCsGCAGC CCCCTCCTTC TGTGTCCATC TGATGCAGGC AAGCAGGAGC AGTAAGAGGG 1080

CATCCCATGT TCCAGTTCAC CTTCTATGGG GTGACTAEGA GGTTCCCGGT AACTAGGGCA 1140

GCCCARGCCC AGCAGGTTGC AAAAGCAGCT GCAAGCTTCA GAAACCGACT TCCTCCAACA 1200

CCAGGGAGG7 GGCAGAGAGC CCATCCAAAA GCCCAOTGGG AGAGGCATAA GATTCTCTGC 1260

CAGGGCCCCA GGTCCCCTCT GTGTCAGGTA GGCTCTGCTA CTGGCGTGTG AAGTAAAGGC 1320

AAANACAAAC GGGCAGGGCA GGGTGGCAGG AATAAAAAAC TCTGGACAGA AACCCTTTTA 1380

ATAAAGGAAA TTCCACCCCT CCCAATCCTT CCATGGAAGG GTGAGACCTT AATGTGATGT 1440

AAGAGGAAGG TCTTCTCTGG CTTTCAGGGA AACAGCTGCA GCTGAAACTT AGGGGCOCAT 1500

TCCAGGGCAC TTTTCACCAC AGCCAGTGCA GCCGCTCCAA GTGCCACTGT CAGCCCCATC 1560

ACTGCCAATT TCACAAAGCG GTTGGTCCTT GGCTTC-GTCA GGACATCTTT TGTTCGATCT 1620

TCAGGCCGCA GAAGTCCCCG AANACCGCTG CCGCAGCACC ATATCAGGCC TCTGCTGGGC 1680

TGATGCCAGC TCAAAGTCTT TGAAAGTAGA GGCTGCCGTC CTCTCAGCTT GCTGTTGGGC 1740

AGCGGCCTCC CGAGCAAGTT CGGATGGGGG AAACTGAACA AAAAGGTCTC CTSTCTGCTG 1800

ATCAGTGTCT CATAGGGCAA GTCCTGAGGG ATCTGGGACA ACAGGTGGTG GACCGAGGCC 1860

ATGTGACAGT CACAGTCCAG GACTTCCTGC TCGCGATAGA ACACAATCAC GGCTGCAAAG 1920

TAAATCGGCA TCAGTGGGTG GC.\GGCCAGZ AAGAAGTCAT ATAACCGCAC GACGTGCCTG 1980

AAGTCAGA.CA GGLACATOCCC AAACCAGGTG ATGAGCCAGC TGAGGGCAAA GATGGTCCCT 2040

ACCTCAGCAC TCTGCATGAA GTCATGGAGC TCTGGATTGA CCTGGTCAAT GATGGGCATC 2100

AGATAGTTTA ATATATGC 2118



(2) INFORMATION FOR SEQ ID NO: 193:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1538 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:

CCQGCnrZC-ZG CTCTGTGTCA GCAGCCGGGC GGCGCTGGGG GGGGACATGG CAGCCTGTAC 50 

AGCCCGGCGG CCTGGCCGTG GGCZGCCGCT GGTGGTCCCG GTCGCTGACT GNGGCCCGGT 120 

QGCC^J-CGCC GCTCTGTGCG CC-CCCGNZ.GC TGGAGCCTTC TC GC CAGCGT CGACCACGAC -130 
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GACGCGGAGG CACCTCTCGT CCCGAAACCG ACCAGAGGGC AAAGTGTTGG AGAC^GTTGG 240 

TGTGTTTGAG GTGCCAAAAC AGAA.TGGAAA ATATGAGACC GGGCAGCTTT TCCTTCATAG 300 

5 

CATTTTTGGC TACCGAGGTG TCGTCCTGTT TCCCTGGCAG GCCAGACTGT ETGACCGGGA 260 

TGTGGCTTCT GCAGCTCCAG AAAAAGCAGA GAACCCTGCT GGCCATGGCT CCAAJ3GAGGT 420 

10 GAAAGGCAAA ACTCACACTT ACTATC.-GGT GCTGATTGAT GCTCGTGACT GCGCACATAT 430 

ATCTCAGAGA TGTCAGACAG AAGCTGTGAC CTTGTTGGGT AACCATGATG ACAGTCGGGC 540 

CCTCTATGCC ATCCCAGGCT TGGACTATGT CAGGCATGAA GACATCCTCC CCTACACCTC 500 


CACTGA.TCAG GTTCCCATCC AACATGAACT CTTTGAAAGA TTTCTTCTGT ATGACCAGAC 650 

AAAAGCACCT CCTTTTGTGG CTCGGGAGAjC GGTAAGGGCC TGGCAAGAGA AGAATCACCG 720 

20 CTGGCTGGAG CTCTCCGATG TTCATCGGGA AACAACTGAG AACA.TACGTG TCACTGTCAT 73Q 

CCGCTT GT AC ATGGGCATGA GGGAAGCCCA GAATTCCCAC GTGTACTGGT GGCGCTACTG 340 

TATCCGTTTG GAGAACCTTG ACAGTGATGT GGTACAGCTG GSGGAGCGGC ACTGGAGGAT 900 


ATTCAGTCTC TCTGGCACCT TGGAGACAGT t GCGAGGCCGA GGGGTAGTGG GCAGGGAACC 960 

AGTGTTATCC AAGGAGCAGC CTGCGTTCCA GTATAGCAGC CACGTCTCGC TGCAGGCTTC 1020 

30 ' CAGTGGGCAC ATGTGGGGCA CGTTCCGCTT TGAAAGACCT GATGGCTCCC ACTTTGATGT 108 0' 

TCGGATTCCT CCCTTCTCCC TGGAAAGCAA TAAAGATGAG AAGACACCAC CCTCAGGCCT 1140 

TCACTGGTAG GCCAGCTGAG GCCCCAAGTG CCCAGGCTTG GTCACGGGGA AGAA.CAACTC 1200 


TCATCCCACA ATTGCTGCAG AACTCTTCTC TCCCCATCAT GGGCGACAGT GGGTCTCTTA 126Q 

ATTTGATTGT GGGGTTCTTT TTGTGGGGAG GGGTGGTATA ACTTTTCTTC AGAAGACCCA 1220 

40 TGTGGGACAC CTCCAA.GGCT GGCCTCCTCA TAAGCCCTGC CTACACCATG TTCCAGTAAA 1230 

CCTCTCCACC AAGGAACTGT GTTCAGCTGC CACAGGCCTG GAGGAGTTTC CTGGCCTGTC 1440 

ACGTGAGGTT TGATCAGTAA ACCAGTGCAS GYTTGGCCAA AAAAAAAAAA AAAAAAAAAA 1300 


AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAACTCGA 153 3 



(2) INFORMATION FOR SEQ ID NO: 194:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1093 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:

AGACCCTGTC TCAAATAATA ATAATAATAA TAATCTTATT TTGGAGAATA AAGPGACCTS
TGGATTTGAG GTGCCATTTG GGTAGAAAGA AAAGACGTTT ACACCGAGAA ATAGTCTGTG 
TTGCCCTGAA GGAGCAGAGG GATGCATCGC TGGAGGTC-AC CTACAGTTGA AGAAGACTCA 
TTATGACAGA CCTTGTCCTT CTTCCTTGTG' gaaagtgttt cctctgctgc TACTGCTCAT 
GAGACTCTTC CCCCTCCCTG TOCCAGGGAA CCAAAGGGCT TTMCTACCAC ACCCTTTCTT 
NGCCCCCCGC CTCCCATGTC TGCTGTGCCT TTGTACTCAG CAATTCTTNG TTTGCTCCCA 
TTATCTTCCA GCCGGATACA GAGTGAATAG TTAACCACAC TTAGGTCAAA TAGGATCTAA 
ATTTTTGTTC CTGCTCCNGT GTAAAGAGGC CAGTGTTTGT GTGTTGCAAG CAGCCTTGGA 
ATAGTAACTC TTCTCATTTG TTTGGGATCT C^CAMCAAG TTCCAGAATC- ATACACGGAT 
CAGTGCAGAA GTTCATCAGG CTCTCGGACC TTAGGGCTGT TGGAGAAGGC TTCAGCAGCA 
GAACTGATGG TKAWKG" TTCG TGTTCTCCAT CCTCAACTTT CTTTGCTTCG ATCATACACA 
AGAATACATT TGGAAGGGCA AAAAATGAAC ACTGTTGTTC ATTGCAGCCG TGTTTTGTGA 
CAGAGATGCA CAGTCTGCTG TGPAGAGCTT CTCTCAAGTG G3ATYTGGGA GTCCATGCGA 
GATCATGGTG CTTCATGAGA GACTGACAGC •TATCAGGGGT TGTGGCACTT AGTGAGGACT 
CTCCTCCCCC AGTGTGTGCT GATGACACAT ACAO-CCTGA CAATAGCTTG A3TCTTCTCT 
GTTCCTTTTA CTCTGTAGCC AAGATACACA TGATTTAAAA CCCTTTCTAA ATATCTATCA 
TGGTTCATCC TTGTCCAAAT GCAGAGTCAG AGCTATTTGT ACTTCATTAT TATTTCGAAG 
GCGAAT AGTT C-GCTTTCTTT TTGCAAAWVT AATTAAAGTT TTTGTATGTT GCAAAAriAAA 
AAAAAAAAAA. CTACGTAG 

(2) INFORMATION FOR SEQ ID NO: 195:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1001 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:
GAATTCGGCA CGAGATAGGT TGCATCTCAT CCC-GTAAAA CCACTTATTT ATAACATATC 
AACGTATTGA CAAGGTTGAA. GAGCAAGATT GTTCTGAGGT GAGATGCAAA TTTCAAAGGG 
GTGAGCACTA ATTGTTCCAG TGA1TGTTTA TTTATTGGCT ASGACATAAT TACTCTCTTT 
GAGGTTACAC ATCTGCCTCC AGGTTCCTGT GTGCTTGTGC CCTTGGGATC AGGCCAGGGC 
AGACTGTGAT CACTGAGATT CA-ACTCCCA GAKTAATCAG a^AGAGCTTT CTAGAGACOA 




AGGCCAGGCC TGATCCCTGA GGGATGCATG AGAA.GGCTTG GAATCTCATT CTGCTATGGT 
GGCTCTCTCT TGATCTTCTT GGAGTAGCAA AAACAGCAAT GTGGGCC GAA TGGTGTGGCC 
TAAA.TGA.TCA CAAAGGTAAA TGAGTAAAGG GCT C.AGCAG A TGAGTAAGGA GCCTTGTCCT 
GAGAAATTAG CACTGGGCTC TGCATTCAGA AACATGTGAT AAGCATTGCC CA.TTGCACAT 
TGCCTTTATT GTGTAAGGA.C ATGAAATTCC AGTTTTGCAT AGCTAGTGAT GAATACCTGA 
AGGGAATTGC AGA.CATA.TTT TA.TTTTATTT TTAATTGA.CA GATGGAA.TTG TATATATTTA 
TCATGTACAT AATCATGCTT . TAAAATATGT ACATTA.TGGA ATGGCTAAAT CAAA.CTAA.CC 
TAGGCATTAT CTCATATAAT TGTCATTTTT GTGGCGA.GAA GACTAAAAAT CTACCCTTTC 
AGCATTTTTA . AAGAA.TACAA TGTGTTTTAT TAACAACAGT CACCATTTGG TACACTAiGAT 
CTCTTGAACT TCTTCCTCTT ATCTAACTGA GA.TCTTGTAA CCTTTGA.TAA CAGCTCCCAA 
GCCCTTCCCC AACCACTGCT CCACCCGTGG TAACCACCAT TCTATTCTCA. ACTTCCTGGT 
AATCACCATT CTAGACACAG GGAAGACTCT CTACCCTGTG A- 


(2) INFORMATION FOR SEQ ID NO: 196:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1443 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:
.ATAAAjCTGAA ATAGGTCATG CAAATATAAA ATATTATTTT TAAATTATTT GTCA.TAA.GAA 
ACGATGGTGG CCATATTTTG CTTTAATAAT GGAAAAAATG TGGTTAGCAT TCTKTGGAAG 
GTGGTCATCA GATAGTAGAC ATTTTCTAGG ATTTATTTCT ACCTGCATAT GTGGAAATGT 
GTACTAjCTTT AGATTTATWT AATGGCAGCT AACTCAGAGG CATCAAAATG TGCTAATGGT 
GTAATATGGC CTTTGTCTTG CTGTYCTGTT T7GTARGCCT TCAATCAAGC ARGGGCAGGG 
CCGTACAGTG AACTTGTCCT TTG3CAGACG CCAGCGTCTG CCCCTGACCC CGTCTCCACT 
CTCTGTGTCC TGGAGGAGGA GCCCCTTGAT GCYTACCCTG ATTCACCTTC TGCGTGCCTT 
GTACTGAACT GGGAAGAGCC GTGCAATAAC GGATCTGAAA. TCCTTGCTTA CACCATTGAT 
CTAGGAGAjGA CTAjGCATTAC GGTGGGCAAC ACCACCATGC ATGTTATGAA AGATCTCCTT 
CCAGAAACGA. CCTACCGGTG AGTGCAAGGG AGTAGAAATG TGCATCAGCA CATCAGCACT 
TGGGGATCTA AGT AAAGGTC TCGGGGAAAA TGACCAA.GTG GATGTCATCT CCCAGCTGTT 
TCTAAGAGCC C-.GATGTCCA GACTA.TTGTC T CACrTTGAT CCCTCAS™ .-^-.-.cr:- 720

TGAAAAAGCC ACACTGGTTC ACC-CL-.CTC.Vr TC^CGGTTT TGTGTCCACT YCAAjIITrGCA 730 

5 CCGTCTCTAC CCCAGAGTGG ACTO-?A.TCC COA^TCATC CTCTGAACAT TG~— GAGA 340 

AATTATAAAA. GGGCTTTGGC AATATGTTAG C CCAA^AACT ?GG LV.-J- ' - C CAJLAAA7TG7 ■ 300 

GCCGACMTTA ACAGTGGCTT AAATGATGGT AAAAJ TTA . AGATTTCTAA AAGG?.TGGCA • 960 


TTGGAGATAC GTTGACTTTT ATTAAAQGC CTATAGTTGT TTAATGAITT CTAAAAAAAT 1020 

ATCTGGAGCT CAGGGGTTCA. ACTGAGGGAA CAjCATGTTGA C-PA.TCATTGT CT .--77 AA.TTA. 10S0 

15 AATGCCAGGT AACCCGTTGA AATTATGAAA AA-C-.TrTTGC AGGTAC-CAOA AAGZAC-CTCA LI 40 

GAGGATAGTT CTGTTATGGA GP-AGATQAAA TGGTT7AGTA GTGTACCAAC TA.IGGAAAGG 1200 

TGAGCTTAGA TTTGGATAGT AAAACCTCAA. GACC— .TTT AAAAAGTATT TTAIGAA-TGC L2SG 


AGCATAAATA ATTTAATTCA GTGTTAANAT GGC-AGGGT A GTATATTGA3 CZ GAAIGTGA 1320 

AAAGAAA.CTC ACATTGGGAG AATGCCACCT CTTCrTTATA .-GATAGCT7T GA.-,GATA£CA 1330 
25 TTTTAGACAG ATGGAAATTG . AAT AGCTTT A GA.-AAGGGAA ATGTTTGArC TTGGCG-AAAA 1 1440 

AAA 1443

(2) INFORMATION FOR SEQ ID NO: 197:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1232 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:

GAAAAAAAAA AGTATGACCC AGTAGCTAGG CACCTTTSGC CCCGCCAAGT CGACACA.TAA 60. 

AATTAACTGT CACAGTA.TCA TCTTAGAAGT GAAftSPMCC CCTTTATCCT C-CAGTGCCCC 120 


TCTACCACCA CCTACTGACA AAGAACATGG T3CTATCTGG CATGGGAGAA ATGTTCAGTT 130 

TGCTATGGCT TGTATGTGTC CCCTCAAATT CAAG7GTTGC CAATGTQACA CCATCAAGAG 2 40 

50 GTGGGGTCTT TAAGAGATCA CTAGGCCATG AGGGATTCTC TTAGGACTGG GAIGAAGGCC " 300 

CATAATAAAA GAGGTTTO.G GGAGCATCCT GCTAG2TTGC CTTCTGTArG CGAGAACACA 3b0 

GCAAGAAAGC CCTAGTCAAC AAGTGCCAGC TCCTTGATCT TAGACTTCCC AGCTTCCAGA 420 


ACTCTGAGAA ATACATTTCT GTTOCTTACA AATT AC C GAG 7CTCCTGTAT TC7GTTAT.-G 4S0 

CAGC ACAAAA TGAAGATACC ATACCTGAAC AC C7GAACAT TCTTCAOAG GTAGTAAATG 540 

60 CACTGCTTTA TTCTGGTCTC AGTATTGTGT C-CTTAATAA.G a-AA.TGA3.-A .-GCGTGGATC 600 



AGGGCATAGG 
GTATTTTCAT 
AGATGTCTAA 
CAAAATTG2-1A 
TAA.TGGTACT 
ATTTATTGCT 
GTGCAGAATT 
GTCTAATTCT 
TCTCGTCACT 
TTTCCACTGT 
CTTTATACAC 



ATGiAC^AGT TACTGCTAGA 



CATTMCTTGT CTCTTCGGAA GCCA- 



CG1-.2A-JGA7T 
COCACTA-A.T 



AAACAGCTTA AGTATTTGTC TAGAAAGGTG G7GCAGTGGI 
AAT AATTT CA. AAC-GGGCTAA AiZ-GACTA: 
ACGACTGTGA AATTTAAAAT 



GGTCCG 

AAAACCTGGT AAAC-.GTTTA ATCC*iTGT GA AGTCCA 
ACTGG-GAGAC TAATAGTCAC CTGACTTGT G C 



7GCTCTT 

GTGGT 



:gtgg 

GGTTACAAAT AAGTAACTGC 0--A.CTA-.GG TTTGTAAAAA GCAA2ACTGA 
CGTTTGGTCA ACAATGTAAA ^JCGTIGCCAGT GTCTCCC 
GTATACAATA CATGCATGAT CTGTATCCAG C-.TCAGGGGG 
GACGGGGCAT GCCACATCAA ATTAAAT7AC CGGG.--GA.-Ar C-CAATGGCAA 
AAAAAAACTC GA 



560 
720 
730 
340 
900 
960 
1020 
1080 
,1140 
1200 
1250 
1232 






(2) INFORMATION FOR SEQ ID NO: 198:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 951 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:
ATTTCGGAAC GAGGACTGAA Gl^G^GCOZ' CGGCAGGGTA. 
TGTGGTAACT AAA3AATGTT 'TCTGTTTTGT TAATTAGGGT GGGTGTGTGG 



GGGGGATCTA 



TGGTTAAGAG AATGAAAAAC TGAAAA-AAT a-iGAATACAG GAAAGGGC" 
TTTTGCTGTG TTTACAGGTT GTTA3-.TGCTC TACTGTCGGT 'GrTTCA.-JG.--G 
ACTGGGCAGG TCGTTTTGTG TCCTGAGC CC TA.TGCCCAGC CG AC GGTAG A 



GTTTAGATGT TTGATTTTGT TGTGTTTGGT ATTGTTATCT TA-AjGGTGGA 
ATGCCAGACA. TCA^A.TTAAG GTOAATTAA GCTCTCGTGT A-A.TGTTTAA 
TATATTCTAA TTGATCCCAG C'GACTGATGG ATGTACTTTA GCTACGTCGG 
ATATTAATTT TCGACATCAG COCATCAsGAT CTTGAGAACC A--CAGTTAGG 
TGTGTACTAA TGTTTCACCT GGATGGAGCG TTCATTAAGT TGGTAGGAAA 
ATGATTATGT AGTTTCTGGA "TTAAA-AAAT TTGTOTGTGA. AGTTG-GTGTG 



TGTTG 
.A-.GG-TOGGT 
GAAGGGTGAC 
A-ACCTAATT 
CTAAA.TAAGG 
GAGA-TTCCG 
ATAGAAAjGTG 
GAAAGTGCAT 



50 
120 
130 
240 
300 
350 
420 
430 
540 
500 
650 
GTGGAATTAA TGGGACAGTG TGCCCTTTGT GTTAGATGTT AGAGCAAAAG AAAjC^OTTA
TAjGTGTTAGT ATTGGAGCAC TTTGAAGATA aATATTTTCA GAAAAGA.TGT AGGA.TTTAAA 
AGTTAAATTT TAAATTTTAG AAAAAJ^ATAT ^.TGC-CAATT GGAAATAGTC ACAATCAAGT 
TCTTCATCCA GTAGGTGTTT AACAGTGTTA TTTTGCCA.CT GGTAATGTGT AAA.CTGTGAG 
TGATTTACAA TAAATGATTA TGAA.TTCAAA AAAAAAAAAA, AAAAAACTCG A 

(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1740 base pairs
(B) TYPE: nucleic acid

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

TTATTATAAT AATGATGATG ATTGCAAGGA AAAAACCTA.C AGCGAATGTT CCATTTCTAC 

COCGCACGCA GACA.CTCTCC CTAACACTGA t TAACCTGAGC CCCCAGCACT GGA.CGGAAGA 

ATGCTGGCGT CTCCGTGTGT ACTGGTTCAG GGTTCTGGCC CCAGCCTTGT CAGGACCCCC 

TGGTGTCC AG AGCCCCCACC CCTCCCGCAA CAAGCAGCTG A.TGCCCCAGT GATTCTCTAT 

ACATTTTTCA CCTCGGCCAA "TATGTCCAGG AAAACTGCTT ACTTCTCTTT TCTTGCCTGG 

AGCCTTCATT GTTCACCCTT ACGTTGCAAT ATAGGAA.TTA ATGCTACAAA ATAAAAGTAA 

AGCTTACCTG AAAAGTGCAT AGTTTGGGGC AATGGTATCT ACA.TCTCCCA CTGTGGGAAA 

ACCAGCAAAG CATCAAAACT CTCAATTCTC CTGTTACCPA ATGCAGATCT GAATTATAAG 

ATGTTTATGT TTGACCATTG TTTCAACAA.T GGGATTTTGT TACGAATTAT CCCTTTAACT 

GAAACCCTCA GTTTTACTGT TTACATTATT AGGAAAAGAG GGATATCTTT TGAATCTAAA 

AATTTGATGT ACAGCATGTG ATTTTTGAAG TTTACATGTA AAGTCACAGT ATAGGTGAAA 

TAACGTTTGT CATATTTTGA GACGTATCCT GCAGCCATGT TTTTACGTGA GTGTTTTAGT 

CAAAGTACAT GGTAGACAGT CTTTCAGAA.T AAAAGGAAAA C^A.TTTTTTT TCCTCCAAAT 

GTACA.TTTAT CAAGCTAA.TG ATTGATTTTT TTAAAAAjGAG ATTTCGCCCC AGTCTGGTTT 

ATGAAAGTTC ATTGCCCTAA ACTGTGCTGA TTGTTTTTAA TCAAGTTATA. AA.TTTCCAAG 

CTAGATCATG TATCTACCAA CTCTCCTGCA. TTTTCCAAAA GGCA.TTGAGC TTAAATA.TTA. 

GTCTTGCTTA GAGTAGGTTA TCCACTTACA TGCTGCGCTA AA.GCCATGCC TTTGAAACTC 

CTTGTTTAAA ACATGATATG ATTTTTGTGG GCAGTTTCAG AAAAGAAAA-C AAACAAACAA 

AAATCGACCC TTTAATTATT ACTTGCAA.CT CAACAGA.TCT CCCTGCCGTA CTCOCTTTTC 
OGGAACTTT acttcagggc tgtocagatt gcagttgtgc

TCACAGAGTC TTTGGAAGCC AGCAGTCGTG CCCTCCGTAT 
AGATTTGGTA TCCTCAGCAG CCAGTGTTAA CACCACTGTC 
TTTTATGTAT TTAAAGTAAT CCATACTATG ATTTGGTTTT 
GCATCAGATC AGTTTTTGTG TTGTGAAGTT CTACTGTGGT 
TGAGACCCTG AAGTAAftGAT AAGGTACACA TACATTATTT 
GCCAATCTGT GTATGCTTTT AGAAGTTTAC AGAATGCTTT 
AGTCTGTCAT TTATTTCTGT TGATAAACCA TTTGGACAGA 
ATCTCCTAGT GCTAACAATA CACTCCAGTC ATGAGCCGGG 
TGATGACT CA MAAAAA*AAA AAAAAAAAWC YCGGGGGGGG 



CCCGTGTATG 
ACTGTCCACT 
ACGTAGTTAM 
TCCCTGCACC 
TTGACCCAAG 
GAGTAACTGT 

GTGAGGACGT 
CTTTACAAAT 
GCOGGTAACC 



TGGATCTAGT 
CATTTTATGT ■ 
CAGATTCATC 
ATTAATTCTG 



TTCCTTGGGG 
TATAACAAAC 
TTGCCCTGTT 
AAAGCACTTT 



1200 
12 SO 
1320 
1330 
1440 
1500 
1560 
1620 
1530 
1740 






(2) INFORMATION FOR SEQ ID NO: 200:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1707 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200:
GCTTATAGAA GGGAGAGGAG CGAAEATGGC AGCGCGTT GG CGGTTTTGGT GTGTCTCTGT 60 

GACCATGGTG GTGGCGCTGC TCATCGTTTG CGACGTTCCC TCAGCCTCTG CCCAAAGAAA 120 

GAAGGAGATG GTGTTATCTG AAAAGGTTAG TCAGCTGATG GAATGGACTA ACAAAAGACC 130 

TGTAATAAGA ATGAATGGAG ACAAGTTCCG TCGCCTTGTG AAAGCCCCAC CGAGAAATTA 240 
CTCCGTTATC GTCATGTTCA CTGCTCTCCA ACTGCATAGA CAC-TGTGTCG TTTGCAAGCA . 3QQ 

AGCTGATGAA GAATTCCAGA TCCTGGCAAA CTCCTGGCGA TACTCCAGTG CATTCACCAA 360 

CAGGATATTT TTTGCCATGG TGGATTTTGA TGAAGGCT.CT G.ATGTATTTC AGATGCTAAA 420 

CATGAATTCA GCTCCAACTT TCATCAACTT TCCTGCAAAA GGGAAACC'CA AACGGGGTGA 430 

TACATATGAG TTACAGGTGC GGGGTTTTTC AGCTGAGCAG ATTGCCCGGT GGATC'GC CGA 540 

CAGAACTGAT GTCAATATTA GAGTGATTAG ACC CCCAAAT TATGCTGGTC CCCTTATGTT 600 

GGGA1TGCTT. TTGGCTGTTA TTGGTGGACT TGTGTATCTT CGP-AC-AGTAA TATGGAATTT 560 

CTCTTTAATA AAACTGGATG GGCTTTTGCA GCTTTGTGTT TTGTGCTTGC TATGACATCT 720 

GGTCAAATGT GGA^-CCATAT AAGAGGACCA CCATATGCCC ATAAGAATCC CCACAGGGGA 7 SO 
CATGTGAATT ATATCCATGG AAGCAGTGAA GCCCAGTTTG TAGCTGAAAC ACACATTGTT 840

CTTCTGTTTA ATGGTGGAGT TACCTTAGGA ATGGTC-CTTT TATGTGAA.GC TGCTACCTCT 900 

GACATGGATA TTGGAAAGCG AAAGATAATG TGTGTGGCTG GTATTGGACT TGTTGTATTA 9b 0 

TTCTTCAGTT GGATGCTCTC TATTTTTAGA TCTAAA.TA.TC ATGGCTACCC A.TACAGCTTT 1020 

CTGATGAGTT AAAAA.GGTC C CAGAGATATA. TAGACACTGG AGTACTGGAA ATTGAAAAAC 1030 

GAAAA.TCGTG TGTOTTTGAA AAGAA.GAATG CAACTTGTAT ATTTTGTATT A.CCTCTTTTT 1140 

TTCAAGTGAT TTAAATAGTT AATCATTTAA CCAAAGAAGA TGTGTAGTGC CTTAACAAGC 12Q0 

AATCCTCTGT CAAAATCTGA GGTATTTGAA AATAATTATC . CT CTT AACCT TCTCTTCCCA 1260 

GTGAA CTTT A TGGAACATTT AA.TTTAGTAC AATTAAGTAT ATTA.TAAAAA TTGTAAAA.CT 1220 

ACTACTTTGT TTTAGTTAGA ACAAAGCTCA AAACTACTTT AGTTAACTTG GTCATCTGAT 1330 
TTTATA.TTGC CTTATCCAAA GATGGGGAAA GTAAGTCCTG ACCAjGGTGTT CCCACATATG • 1440 

CCTGTTACAG ATAACTACA.T TAGGAATTCA TTCTTAGCTT CTTCATCTTT GTGTGGATGT 1500 

GTATACTTTA CGCA.TCTTTC CTTTTGAGTA GAGAAA.TTAT GTGTGTCATG TGGTCTTCTG 1560 

AAAATGGAAC ACCATTCTTG AGAGCACACG TCTAGCCCTC AGCAAGACA.G TTGTTTCTCC 1620 

TCCTCCTTGC ATATTTCCTA CTGAAATACA GTGCTGTCTA TGATTGTTTT TGTTTTGTTG .1630 
TTTTTTYGAG ATCACGYTAC TGGGCTC . - 1707 



(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 779 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:



CTGTCCCCAG TGTTTCCAGG TAATGACTTG GCACTCCAGA GAAAGTTTCA TRCTGTTGCG 50 

TGTGGTGGCT CCAAGCCAAG CACCTGGCAT GCAGGTCAGC CCTTCCCAGC COGCGTCGCG 120 

TCGTCCTCTT CACAGATGCC ACGTTGCAGC CCCAAGGCCT CACCATTTTG CGTTTTTTAG 130 

AAACCCATTT TCTTGGTCAT TTATAAAGCT GCTTTATAGA TA.TCTTTGA.T CCTGGCA.TGC 240 

CTTGGTTTCC TCTCCCTTCC CTCTTTCCAA TCCTGGTTTC CTAACCTCCT CTTGTAGTAA 300 

TTCTCAACTC AAjCTCAAAGT CCCAAGAATT TGGAATGGTA GGATGCTGTG CGGGGAGCTC 3 60 

GAGGCTGAGG CATAATCACT GCTTCGGTTC TGCTCATCAG GGGACACGCT CCCTTACTCA 420 

TGGCAGCCAT GTTTGATTGT CACAGAGCCC CCCGAATACT CTGTCTATAG TGAOjCACTG 430 
TAGGTGTCAT AAATTTTAAG AAACCTGCTT TTAAGTACTA TTTATAGGTT TTTCTGTTAT 540

ACTTGCAACC TAJ3T7TTAAA ATACATGAGG ATTTTATGAA AGCTTTATAC AGAOATTTAT 500 


AGGAAACTCA TTCTTTGATT TTAGGTGCCA TTTAAATTGA TAACACTTAC TTTATAAAAA 660 

GATGCTTTTT GTCTGGATAG AGCCTTATAG TTTAAAATAT CTTCATATAT TGCCATTTGA 720 

10 TCAAATAAAT TTCTTACTTA GAAAAPAAAA AAAAAAAAAA AAAAAAAAAA AAAACTCGA 775 



(2) INFORMATION FOR SEQ ID NO: 202:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1517 base pairs
(B) TYPE: nucleic acid

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

25 GGCACAGCTT TCTGTCTCTT CCTCGCTCCC TCTCTTTCTC TCCTCCCTCT GCCTTCCCAG 60 


TGCATAAAGT CTCTGTCGCT CCCGGAACTT GTTGGCAATG CCTATTTTTT GGCTTTCCCC 120 

CGCGTTCTCT AAA.CTAACTA. TTTAAAGGTC TGCGGTCGCA AATGGTTTGA CTAAACGTAG 130 


GATGGGACTT AAGTTGAACG GCAGATATAT TTCACTGATC CTCGCGGTGC. AAATAGCGTA 240 

TCTGGTGCAG GCCGTGAGAG CAGCGGGCAA GTGCGATGCG GTCTTCAAGG GCTTTTCGGA 3 00 

35 CTGTTTGCTC AAGCTGGGCG ACACATGGCC AACTACCCGC AGCCTGGGAC GACAAGACGA 360 

_ ACATCAAGAC CGTGTGCACA TACTGGGAGG ATTTCCACAG CTGCACGGTC ACAGCCCTTA 42 Q 

CGGATTGCCA GGAAGGGGCG A-AGATATGT GGGATAAACT GAGAAAAGAA TCCAAAAACC 430 


TCAACATCCA AGGCAGCTTA TTCGAACTCT GCGGCAGCGG CAACGGGGCG GCGGGGTCCC 540 

TGCTCCCGGC GTTCCCGGTG CTCCTGGTGT CTCTCTCGGC AGCTTTAGGG ACCTGGCTTT 50 Q 

45 CCTTCTGAGC GTGGGGCCAG CTCCCCCCGC GCGCCCACCC ACACTCACTC CATGCTCCCG S6Q 

GAAATCGAGA GGAAGATCCA TTAGTTCTTT GGGGACGTTG TGATTCTCTG TGATGCTGAA 72 Q 

AACACTCATA TAGGATTGTG GGAAATCCTG ATTCTCTTTT TTATTTCGTT TGATTTCTTG 730 


TGTTTTATTT GCCAAATGTT ACCAATCAGT GAGCAAGCAA GCAC^JGCCAA AATCGGACCT 840 

CAGCTTTAGT CCGTCTTCAC ACACAAATAA .GAAAACGGCA AACCCACCCC ATTTTTTAAT 900 

55 TTTATTATTA TTAATTTTTT TTGTTGGCAA "AAGAATCTCA GGAACGGCCC TGGGCACCTA 960 

CTATATTAAT CATGCTAGTA ACATGAAAAA TGATGGGCTC CTCCTAA.TAG GAAGGCGAGG 1020 

AGAGGAGAAG GCCAGGGGAA TGAATTCAAG AGAGATGTCC ACGGACCAAA CATACGGTGA 1080 

ATAATTGACG CTCACGTCGT TCTTCGACAG TATCTTGTTT TGATO--TTTC CACTGCACAT 1140

TTCTCGTCAA GAAAAGCGAA AGGACAGACT GTTGGGTTTG TGTTTGGAGG ATAGGAGGGA 1200 

5 GAGAGGGAAG GGGCTGAGGA AATCTCTGGG GTAA.GAGTAA AGGCTTCCAG AAGACATGCT 1260 

GCTATGGTCA CTGAGGGGTT AGCTTTATGT GCTG7TGTTG ATGCATCCGT CCAAGTTCAC U 2 Q 

TQ^rTTT ITT TTQrr«VQrTC Q 1 " TPT T fTTTT ijGCTG* 1 ^ - C ^G-CACJ^G^A iTir.— VQJj.JL'V " i ] 3Q 

10 ------- — — 

ATCCAACGGT ATAGATCACA AGGGGGGGAT GTTAAATGTT AATCTAAAAT ATAGCTAAAA 1440 

AAAGATTTTG ACATi^AAAGA GCCTTGATTT TAAAAAAAAA AGAGAGAGAG ATGTAATTTA 1300 

15 AAAA.GTTTAT TATAAA.TTAA ATTCAGCAAA AAAAGATTTG CTA.CAAAGTA. TA.GAGAAGTA 1550 

TAAAATAAAA GTTATTGTTT GAAAAAAAAA AAAAAAAAAW CTCGACCGCA AGGGAAT 1517 

(2) INFORMATION FOR SEQ ID NO: 203:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1974 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:

GAATTGGGCA CGAGGCTGAG GGAGCTGGAG CGGAGCAGAG TATCTGACGG CGCCAGGTTG 50 

CGTAGGTGCG GCACGAGGAG TTTTCCCGGC AGGGA.GGAGG TCCTGAGCAG CATGGCGCGG 120 


AGGAGCGCCT TCCCTGCCGC CGG'GGTCTGG CTCTGGAGGA TCCTCCTGTG CCTGCTGGCA 130 

CTGCGGGCGG \GGCCGGGCC GCCGCAGGAG GAGAGGGTGT ACGTATGGAT GGATGCTGAC 240 

40 CAGGCAAGAG TAGTCATAGG ATTTGAAGAA GATATCCTGA TTGTTTCAGA GCGGAAAATG 300 

GCACCTTTTA CACATGATTT CAGAAAAGGG CAACAGAGAA TGC CAGGT AT TCCTGTCAAT 350 

ATCCATTCCA TGAATTTTAC CTGGCAAGCT GCAGGGCAGG CAGAATACTT GTATGAATTG 420 


CTGTGCTTGC GGTCCCTGGA TAAAGGCATC ATGGCAGATG CAACCGTCAA TGTCCCTCTG 430 
CTGGGAACAG TGCCTCACAA GGCATCAGTT ' GTTGAAGTTG GTTTCGCATG TGTTGGAAAA .. 540 

50 CAGGATGGGG TGGCAGCATT TGAAGTGGAT GTGATTGTTA TGAATTCTGA AGGCAACACC 600 

ATTCTCCAAA CAGCTCAAAA TGCTATCTTC TTTAAAA.CAT GTCAACAAGG 'TGAGTGGG CA 650 

GGCGGGTGGC GAAATGGAGG CTTTTGTAAT GAAAGACGCA TCTGCGAGTG TCCTGATGGG 72 Q 


TTCGACGGAC CTCACTGTGA GAAAGCCGTT TGTACCCGAC GATGTATGAA TGGTGGAGTT 730 

TGTGTGACTC CTGGTTTCTG GATCTGCCCA GCTGGATTCT ATGGAGTGAA CTG1X2ACAAA 340. 

60 GCAAACTGCT CAACCACCTG CGTTAATGGA GGGAGGTGTT TCTACGCTGG A-AATGTATT 900 
TSCCCTCCAG GACTAGAGGG AGAGCAGTGT GAAATCAGCA AATGCCCACA AGCCTGTCGA 960

AA.TGGAGGTA AATGCA.TTGG TAAAAGCAAA TGTAAGTXTT CCAAAGGTTA CCAGGGAGAC 1020 


CTCTGTTCAA AGCCTGTCTG CGAGCCTGGC TGTGGTGCAC ATGGAACCTG CCATGAAGCC 1080 

AACAAATGCG AATGTCAAGA AGGTTGGCAT GGAAGACACT GCAATAAAAG GTACGAAGCC 1140 

10 AGCCTCATAC ATGCCCTGAG GCC^GC^GGC GCCCACCTCA GGCAGCACAC GCCTTCACTT 1200 

AAAAAGGCCG AGGAGCGGCG GGATCCACCT GAATCCAATT ACATCTGGTG AACTCCGACA 1260 
TCTGAAACGT TTTAAGTTAC ACCAAGTTCA TAGCCTTTGT TAACCTTTCA TGTGTTGAAT r 1320 

GTTCAAATAA TGTTCATTAC ACTTAAjGAAT ACTGGCGTGA ATTTTATTAG CTTCATTATA 1330 

AATCACTGAG CTGA.TATTTA GTCTTCCTTT TAAjGTTTTCT AAGTACGTCT GTAGCATGAT 1440 

20 GGTATAGATT TTC TTGTTT C AGTGCTTTGG GACAGATTTT ATATTATGTC AATTGATCAG . 1500 

GTTAAAATTT TCAGTGTGTA GTTGGCAGAT ATTTTCAAAA TTACAATGCA TTTATGGTGT 1360 

CTGGGGGCAG GGGAACATGA GAAAGGTTAA ATTGGGCAAA AATGCGTAAG TCACAAGAAT 1520 


TTGGATGGTG -CAGTTAATGT TGAAGTTA.CA* GCATTT CAG A. TTTTATTGTC AGATATTTAG 1580 

ATGTTTGTTA GATTTTTAAA AATTGCTCTT AATTTTTAAA CTCTCAATAC AA.TATATTTT 1740 
30 GACCTTACCA TTATTGGAGA GATTCAGTAT TAAJ^AAAAAA. AAAATTACAC TGTGGTAGTG ' 1300 

GCATTT AAAC AAT ATAATAT ATTCTAAACA CAATGAAATA GGG AATATAA. TGTATGAACT 1360 

TTTTGCATTG GCTTGAAGCA ATATAATA.TA TTGTAAACAA AACACAGCTC TTACCTAATA 1920 


AAGATTTTAT ACTGTTTGTA TGTATAAAAT AAAGGTGCTG CTTTAGTTTT CTGA 1974 



(2) INFORMATION FOR SEQ ID NO: 204:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1057 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:

CGGCCTTCCG GGGCAACCGT TCGTCCCAAC NCGGGAAAGG GTCCTGGAGN CGGGAACTAG 60 

GAGCCTCGGA AGTCCAAGGG CGGAGCGCCC TTTGCTAATA AGCCAATCAG AACGTGAGAC 12 Q 

55 GCTCCGGTGG GNCGGTGCCG TCGAGCGCGG GGTGGAGTCT GGGTGACTTG GCTGGCGGGA 130 

TCAAGTGCAG CTGCTTCAGG CTGAGGTGGC AGATAGTGAG CGCTGGTGGC GGAGTTAAAG 240 

TYAAAGCAGG AGAGTAAT^vA TGAA.TAGCGC AGCGGGATTC TCACACCTAG AC Z GTCGCGA 200 

GCGGGTTCTC AAGTTAGGGG AGAGTTTCGA GAAGCAGCCG G3CTCCGCTT OCACACTGTG 360

CGCTATGACT TCAAACCFGC TTCTATTGAC ACTTCTTCTG AAGGATACCT TGAGKTTGGC 420 

5 GAAGKTGAAC AGKTGACCAT WACTCTGCCM AATA.TAGAAA GTTGAAGGAA GCAGTAAAAT 430 

TCAGTATCGT AAAGAAC-A.C AGCAACAACA ATGTGGAATT CASCGAGGAC TCCCAATCTT 540 

GTAAAACATT CTCCATCTGA AGATAAGATG TCCCCAC-CAT CTCCAATAGA -TGATATCGAA 600 

AGAGAACTGA AGGCAGAAGC TAGTCTAATG GACCAGATGA GTAGTTGTGA TAGTTCATCA 660 

GATTCCAAAA GTTCATCATC TTCAAGTAGT GAGGATAGTT CTAGTGACTC AGAAGATGAA. 720 

L5 GATTGCAAAT CCTCTACTTC TGATACAGGG MAA.TTGTGTC TCAGGACATC GTAC'GATGAC 730 

ACAGT ACAGG ATTCCTGATA TAGATGGCAG TCATAATAGA TTTCGAGACA ACAGTGGCCT 340 

TCTGATGAAT ACTTTAAGAA ATGATTTGGA GGTGAGTGAA TCAGGAAGTG AGAGTGATGA 900 


CTGAAGAAAT ATTTAGCTAT AAATAAAAAT TTATACAGCA TGTATAATTT ATTTTGTATT 960 

AACAATAAAA. ATTCCTAAGA CTGAGGGAAA TATGTCTTAA CTTTTGATGA T^AAAGAAAT 1020 

25 TAAATTTGAT TCAGAAAA^A. AAAAAAAAAA AAGTCGA 1057 

(2) INFORMATION FOR SEQ ID NO: 205:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 721 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:



40 GAATTCGGCA CGAGTCATCC CTCTCCCTCT TTCACTCCCT- TACTCTTACT CTGTTTTTTG 60 

TGCTCCAGAC AGACAGACCC TACCTCTTTT GCTTCTTTTT TGTTTGTTTG TTTTGAGATG 120 

GAGTGTC'GCT CTTGTTGCCC AGGCTGGAGT GCAGTGGCGC AATCTCGGCT CACGACAACC 130 


TCTGCCTCCC GGGTTCAAGC AATTCTCCTG CCTCAGCCTC CCGAGAAGCT GGGGATTAGA 240 

GGCATGCGCC ACCAEACCCA GCTNAATTTT ATATTTTTAG TAGAGATGGT GTTTCTCCAT 300 

50 GTTGGTCAGG CTGGCCTCAA ACTCCCAACC TCAGGTGATN CCGCCTGCTT TGGCCTCCCC 360 

AAAGTGCTGG GATTACAGGC GTGAGC CACT GCC-CCCAGCC TCTTTTGCTC CTTTATACTC 420 

ATTAACTCAC GCCTGTAATC CCTGTTTTGG GAGGCCAAA.G TGAGAAGGTT GCTTGAGGCC 430 


AAGAGTTTGA GACTASCCTG GGCL^ACACAG CAAGATGCCA TCTTTATAAT AAAA^JTAAAA 540 

ATAAAAATCA ATTAGCTGGG CATGGTGC-AA CGCACCTGTA GTCCCAGCCA ATTGAGAGGC 600 

60 TGAAGTGGGA GGATCATTGA GCCCAGGAGT TGAC-GTTGCA GTGAGC CATG ATGATGTCAC 660 
TACACTCAGC CTGGGCAATA GAGGGACATG TTGTCTCTAA AAAAAAAAAA AAAAAACTCG 720



(2) INFORMATION FOR SEQ ID NO: 206:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2465 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206: 

CCACCATTTA TCCAACTGAA GAGGACTTAC AGGCAGTTCA GAAAATTGTT TCTATTACTG 50 


AACGTGCTTT AAAACTCGTT TCAGACAGTT TGTCTGAACA TGAGAAGAAC AAGAACAAA.G 120 

AGGGAGATGA TAAGAAAGAG GGAGGTAAAG ACAGAGCTTT GAAAGGAGTT TTGCGAGTGG 130 

25 GAGTATTGGC AAAAGGATTA CTTCTCCGAG GAGATAGAAA TGTCAACCTT GTTTTGCTGT 240 

GCTCAGAGAA ACCTTCAAAG ACATTATTAA GCCGTATTGC AGAAAACCTA CCCAAACAGC 300 

TTGCTGTTAT AAGCCCTGAG AAGTATGACA TAAAATGTGC TGTATCTGAA GCGGCAATAA 360 


TTTTGAATTC ATGTGTGGAA CC-CAAAATGC AAGTCACTAT CACACTGACA TCTCCAATTA 420 

TTCGAGAAGA GAACATGAGG GAAGGAGATG TAACCTCGGG TATGGTGAAA GACCCACCGG 430 

35 ACGTCTTGGA CAGGCAAAAA TGCCTTGACG CTCTGGCTGC TCTACGCCAC GCTAAGTGGT 540 

TCCAGGCTAG AGCTAATGGT CTGO.GTCCT GTGTGATTAT CATACGCATT CTTOGAGACC 600 

' TCTGTCAGCG AGTTC CAACT TGGTCTGATT TTCCAAGCTG GGCTATGGAG TTACTAGTAG 660 


AGAAAGCAAT CAGCAGTGCT TCTAGCCCTC AGAGCCCTGG GGATGCACTG AGAAGAGTTT 720 

TTGAATGCAT TTCTTCAGGG ATTATTCTTA AAGGTAGTCC TGGACTTCTG GATC=CTTGTG 730 

45 AAAAGGATCC CTTTGATACC TTGGCAACAA TGACTGACCA GCAGCGTGAA GACATCACAT 340 
CCAGTGCACA GTTTGCATTG AGACTCCTTG CATTCCGCCA GATACACAAA GTTCTAGGCA " 900 

TGGATCCATT ACCGCAAATG AGCCAACGTT TTAACATCCA GAACAACAGG AAACGAAGAA 960 


GAGATAGTGA TGGAGTTGAT GGATTTGAAG CTGAGGGGAA AAAAGACAAA AAAGATTATG 1020 

ATAACTTTTA AAAAGTGTCT GTAAATCTTC AGTGTTAAAA AAACAGATGC CCATTTGTTG 1080 

55 GCTGTTTTTC ATTCATAATA ATGTCTACAT TGAAAAATTT ATCAAGAATT TAAAGGATTT 1140 

CATGGAAGAA CCAAGTTTTT CTATGATATT AAAAAATGTA C^GTGTTAGG TATTATTTGA I20G 

ATGGAAAGAC ACC-CAAAAAA AAAAATGTGC TCCGACTAGG GGGAAAACAG TAGTTCCGAT 1250 
TTTTTCCCAT TATTTTTATT TTATTTTCTG GTTGCCCTAG CTTCCCCCCC TATTTTTGTG 1320

TCTTTTATTA ACTAGTGCA.T TGTCTTATTA AATCTTCACT GTATTTAATG CAGGATGTGT 1380 

5 GCTTCAGTTG CTCTGTGTAT TTTGATATTT TAATTTAGAG GTTTTGTTTG CTTTTTGACA 1440 

CTAGTTGTAA GTTACTTTGT TATAGATGGT ATCCTTTACC CCTTCTTAAT ATTTTACAGC 1300 
AGTACGTTTT TTTGTAACGT GAGACTGCAC AGTTTGTTTT TCTATATGTG AAGGA.TTACA ' 1560 


ACACAAAAAG TTATCCTGCC ATTCGAGTGC TCAGAACTGA ATGTTTCTGC AGATCTTGTG 1-520 

GCATTTGTCT CTAGTGTGAT ATATAAAGGT GTAATTAA.GA CAGAGTTCTG TTAATCTAAT 1530 

15 CAAGTTTGCT GTTAGTTGTG CATTAGCAGT ATAAAAGCTA ATATATACTA TATGGTCTTG 1740 

C\ACAGTTTT AAAGCCTCTG . CATAATTGA.T AATAAAAATG CATGACATTC TTOTTTTTAA 1300 

TAGACTTTTA AAATCATAAT TTTAGGTTTA ACACGTAGAT GTTTGTACAG TTGACTTTTT 1350 

GACATAGCAA GGCGAAAAA.T AA.CTTTCTGA ATATTTTTTT CTTGTGTATA AGTGGAAAGG 1920 

GCATTTTTCA CA.TAXAAGTG GGCTAACCAA TATTTTCAAA AGAACTTCA.T CATTGTAGAA. 1380 

25 CTAACAACAG TAACTAGCCC TTAATTATGG TGACAGTTCG TTATTGGTGT GTGTGAGATT 2040 

' ACTCTAGCAA CTATTACAGT ATAACACAGA TGATCTTCTC CACACAGCCC ATCACGCAGA 2100 



TAATTTACAG ■ TTCTGTTAAC AGTGAGGTTG ATAAAGTATT ACTGATAAAA AATTATCTAA 2160 

30 

GGAAAAAAAC AGAAAATTAT TTGGTGTGGC CATCTTACCT GCTTATGTCT CCTACACAAA 2220 

GCT.AAATATT CTAGCAGTGA TGTAATGAAA AATTACATCT TACTGTTGAT ATATGTATGC 2230 

35 TCTGGTACAC AGATGTCATT TTGTTGTCAC AGCACTACAG TGAAATACAC AAAAAA.TGAA 2340 

ATTCATATAA TGAGTT^AA.T GTATTATATG TTAGAA.TTGA CAACATAAAC TACTTTTGCT 2400 

TTGAAATGAT GTATGCTTCA GTAAAATCAT ATTCAAATTT A\AAAAAAAA AAAAAAAAAA 2450 


CTCGA 2455 




(2) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1430 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:

GAATTCGGCA CGAGCTCAAG CTGGCAGGTG GTCGGGGGAG CGGCCGGAGA GGAGCTGCCG 60 
GGAGTTCGTG CCCTGCAGGA CATGACACCA GTGGCATATC ACGGCCATGG GGTCTCAGCA 120 
60 TTCCGCTGCT GCTCGCCCC? CCTCCTGCAG GCGAAAGCAA GAAGATGACA GGGACGGTTT 130 
GCTGGCTGAA CGAGAGCAGG AAGAAGCCAT TOCTCAGTTC CCATATGTGG AATTCACCGG 240

GAGAGATAGC ATCACCTGTC TCACGTGCCA GGC^ACAGGC TACATTCCAA CAGAGCAAGT 300 


AAATGAGTTG GTGGCTTTGA TCCCVCACAG TGATCAGAGA TTGCGCCCTC AGCGAACTAA 260 

GCAATATGTC CTCCTGTCCA TGCTGCTTTG TCTCCTGGCA TCTGG T T T GG TG GTTTT CT T 420 

10 CCTGTTTCCG CATTCAGTCC TTGTGGATGA TGACGGCATC AAAGTGGTGA AAGTCACATT 430 

TAATAAGCAA GACTCCCTTG TAATTCTGAC CAT CATGGCC ACCCTGAAAA TCAGGAACTC 340 

OACTTCTAC ACGGTGGCAG TGACCAGCCT GTC CAGCCAG ATTGAGTACA TGAACACAGT 600 


GGTGAATTTT ACCGGGAAGG CCGAGATGGG ACGACCGTTT TCCTATGTGT ACTTCTTCTG 660 

CACGGTACCT GAGATCCTGG TGCACAACAT • AGTGATCTTC ATC-CGAACTT CAGTGAAGAT 720 

20 TTCATACATT GGCCTCATGA CCCAGAGCTC CTTGGAGACA CATCACTATG TGGATTGTGG 73Q 

AGGAAATTCC ACAGCTATTT AACAACTGCT ATTGGTTCTT CCACACAGCG CCTGTAGAAG . 840 

AGAGCACAGC ATATGTTCCC AAGGCCTGAG TTCTGGACCT ACCCCCACGT GGTGTAAGCA 900 


GAGGAGGAA.T TGGTTCACTT AACTCCCAGC AAACATCCTC CTGCCACTTA GG AGGAAACA 960 

CCTCCCTATG GTACCATTTA TGTTTCTCAG AACCAGCAGA ATCAGTGCCT AGCCTGTGCC 102Q 

30 CAGCAAATAG TTGGCACTCA ATAAAGATTT GCAGAATTTA ATACAGATCT TTTCAGGTGT 1030- 

TCTTAGGGCA TTATAAATGG AAATCATAAC GTGGTTCTAG GTTATCAAAC CATGGAGTGA 1140 

TGTGGAGCTA GGATTGTGAG TGACCTGCAG GCCATTATCA GTGCCTCATC TGTGCAGAAG • 1200 

TCGCAGCAGA GAGGGACCAT CCAAATACCT A^GAGAAftAC AGACCTAGTC AGGATATGAA 1260 

TTTGTTTCAG CTGTTCCCAA AGGCCTGGGA GCTTTTTGAA AA3AAAGAAA AAAGTGTGTT 1320 

40 GGCTTTTTTT TTTTTTAGAA AGTTAGAATT GTTTTTAGCA AGAGTCTATG TGGGGCTTGA 1330 

TTCACCCTTC ATCCATTGGC TGGAACATGG ATTGGGGATT TGATACAAAA ATAAACCCTG 1440 

CTTTTGATTC Ai^AAAAAAA AAAAAAWAAA AAAAACTCGA 1430 






(2) INFORMATION FOR SEQ ID NO: 208: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 372 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:
CAGTATTTCC CTCAGTACTG TAAGCAAAAG TGGTATGTTT TTCTTTCTTT ATGTCTACTC 60 
TGTCCTCTGT GGCCTTCTGG TGTACCCCTC TCTTCCTAGC CATTCAGTCT GTCTAGTCAC 120

CTCCCTAGTA GCTAGTGC7C TGTAAGTTTT TATTTAATTA GAACAACTCC ATTTCCATTT 130 

5 CAAGGTACC-T CAATGGGGGG AAAAGG CT CA TGATTTAAAG TGAAGTTAAC AAGACAGCTT 240 

TTAAAATGAA AACTCATACT CCAACTTCTA AAGTATATTT GAGCTGATTT GTTTCCAAAA 300 

CAAAGATATG CTGTACCTAA AAGTGGTAAA ACAAAAATAT AAAGACAAGG ACTAGGTGAT 3 50 


TAAGGGGAGA GAAAAATCAT YTCTTTTCCA GGAAACCTTT GCTAAAATAA GCAAAACTTG 420 

AMTCTATGCT TCATGGAAAC TGACACAAAG AAAAGAAACT GATGGATTGC ACAGGCCTTG 480 

15 TTATAGAAAT AGATCTATAA AAAGAT CTGT CCACAGGAAA TATACACGTT CTCCTG3TTC 540 

TGAACTTCAA TGGGGATTTG TCACCTAGGT CTCCATCTAT AGGAATACCT TCACATACCT 600 

ATCTATTCAT GCACATATTC TGAAAACAGG TACATACAAA ATTAGAACAA AGGAAAAAAA 660 


TTCTATTGAA CACTTAAAAA TAGAAACAGG GCAGGCACGG TGGCTCATGG TGTAATCCCA 720 

ACAATTTGGG AGGCTGAGGC TGGTGGATCA CGTGAGGTGA GGAGTGTGAG ACCAGCTTGG 730 

25 CCAAGATGGT GAAACCCCGT CACTACTAAA AATAOAAAA AAATTAGGCT GTGTGGTGGG 340 

ACACTCNTAC AATCCMGGCT GACTGGGGAA ' AM 372 

(2) INFORMATION FOR SEQ ID NO: 209:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1779 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:

AATTGCCAAG ACTGCAOAA ATTACAGTGG TAATGTA.TAT C^TTGGAGTT GACATAAAGA * - 50 

CAAAAGCATC TGTTATGAAA. TGAGTAGTAA TATTGGGTGG TTGATTTGTT CTTAGCAGAC 120 


TTGGCTTCAT WTTGGTCTTG AGATAAAATG GCCAGCATAA ATGGTGTTTA TATTCACGTT ■ 130 

TTCCTAGGTG TGTGTGTGCA GGCCACAGCA GCATGCCCTT GGTGTAGTCA GTGCGGAAAS 240 

50 GGGTGTGTTC CTTCTTGAGC CTGCCTGCAG GGATGGTCTC CTTTTAAAGC AGGTTGTGTG 300 

CAGCATTCAG TACACTGAAG GTAAGGTAAA CCATCAACAT CTCTGGTGTT TTAAGATGTT 360 

ATTTTATTGG AACAACTGAC AAATGAGGGA TGTTAGCTTT GTGGCAGAA.T TCCCTGCATG 420 


TGTGATAAGT GATGTTGTTT TATTTTTTGG CATTGCAACT GTGGCATAGT TACAATTTCT 430 

GTTTGKTCAT CACATTTAAA ATTGGKAGAG AACGCGCTTG AKGGATAGAG OGCCTTGAGK 540 

60 GTACTGTTTC TTATTAACTT TACTTTTTTT AAA.TCAACTT GCTATAGACT TTATATACAT 600 
TTTGTTAAAT ATAGTTCCTA GTGACATAGA AACGATGCGT AGTTTTCATT TACTAATTAC 660

AAATGTTGAG GCCTAATTCT GAAAGTCCTC ATATTTAAAG GCTA.GACAAC GTAATGAAAT 720 

TTTTAACTAT TTGTATGTCA TTTTGAAAGT GTACTGCTTT ATGGTAAAAG TGTTTTTCAT 730 

TTGTTCATTG TTTTCATTAT TTGTGATCAT GTTGTCTTTC AA.TACA.GGCA. TAAA.CCTTCC 840 

10 ACTCTTGAAC AAAGCAGCTG CTTTTTAAAA GCGGTAATTG CTTCTTTACC TTTTATTTCT 900 

TTTGTAAATG AAGCTTTTCT TTAAGAATGT GACTTTAAAG TGTTGTCTAT TGCATAAAAC 960 

AGTTGACACT CACTTATTGT AAAGTGAA.GA TTGTTCTACT GCATGTGAAG TGGACCATGC 10? 0 


AGATTTCTGT ATGTTCTCAG TATGCATCAC TAGATAATAA AGTCTTTTGT GAACAAGGCA 1080 

TTTGTAGCCA TTTTTAAAAG TTTTTGTCTT CAGTGCTGGT AAGTCAGGTA AACCATAAA.T 1140 

20 AGTTAAAAGC AACCTTTTGT TTTTTTCCTG AAAGTTTTTA ATTGAAAGTA TTATTAGTTA 1200 
AAGATGTAAA C GTAGCCAAA ATTACCAGTT TATTAA.TAAT TAGGA.TCCTA ATTATTTCAA ' ' ' 1250 

AAAATCGTAC AAATATTGTC AGCTTTCAGT GTAGTGAGAT TATTCCTGTA GGTTATGGGG 1320 


TATAATTCAG GATTTAACTA ATGTTTCTGC TATTTTCTCA CTTTTCGTTT TCATGGTGCG 13 30 

GAAAGAGAAA AAGGAAAACG GGGCACAGGC CATTCGACGC CTTCTCCAAG GGGTCTGATT 1440 

30 TGCTGA.GA.CA CCAGCTTCAC CTTCTTAACA AGGCACCTAA TTACAACAAG CATGCACATT 1500 

TTGGTGCATT CAAGAATGGA AAATCAGAAT AGCAGCATTG ATTCTTCTGG TGCAGCTCAG 1550 

TGGAAGATGA TGACAACCAG AAGACATGAG CTAAGGGTAA GGGACTGTTC TGAAGAACCT 1620 


TTCCATTTAG TGATCAAGA.T ATGGAAGCTG ATTTCTGAAA ATGCTCAGTG TGTACTCTAA . 1630 

TTATTTATGG TACCATTTGA A.TTGTAACTT GCATTTTAGC AGTGCATGTT TCTAATTGAC 1740. 

40 TTACTGGGAA ACTGAATAAA ATATGCCTCT TATTATCAA. 1779 



(2) INFORMATION FOR SEQ ID NO: 210:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2110 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:
55 GCGGCCGCTG CAGCCCGGAG CTGAGCTAGC CGTCCGAGCC GAGCCGTCCG AGCCGGGGAA 60 
GCCGGCGCGT GCTGCCGCTC GTGGCGGCCA GAGGAGAGGA GAGGCAGCAG CA.TGGCGAGT 120 
GTCCTGTCCC GACGCCTTGG ' AAA.GCGGTCC CTCCTGGGAG CCCGGGTGTT ' GGGACCCAGT 130 

GCCTCGGAGG GGCCTCGGCT GCCCCACCCT CGGAJGCCACT GCTAGAAGGG GCCGCTCCCC 240



AGC GTTTCAG OVCCTCTGAT GACACCCGCT GCCAGGAGCA GGCCAAGGAA GTCCTTAAGG 300 



5 CTCCCACCAC CTCGGGCCTT CAGCAGGTGG CCTTTKftGCC TGGGGAGAAG GTTTATGTGT 360 



GGTACGGGGG TCAAGAGTGC ACAGGACTGG TGGWGCAGCA C-JGCTGGATG GA.GGGTCAGG 420 



TGACCGTCTG GCTGCTGGAG CAGAAGCTGC AGGTCTGCTG CAGGGTGGAG GAGGTGTGGC 430 


TGGCAGAGCT GCAGGGCCCC TGTCCCCAGG CACCACCCCT GGAGCCCGGA GCCCAGGCCC 5 40 



TGGGCTACAG GCCCGTCTCC AGGAACATGG ATGTCCCAAA GAGGAA.GTCG GACGGATGGA . 500 



15 AA.TGGATGAG ATGATGCCGG CCATGGTGCT GAGGTCCCTG TCCTGGAGCC CTGTTGTACA 650 

. GAGTCCTCCC GGGACCGAGG CGAACTTCTC TGCTTCCCGT GCGGGCTGCG AGGCATGGAA 720 

GGAGAGTGGT GACATCTCGG ACAGCGGCAN CAGCAGTAGC AGGGGTCACT GGAGTGGGAG 730 


CAGTGGTGTC TCCACCCCCT CGCCCCCCCA CCCCCAGGCC AGCCCCAAGT ATTTGGGGGA 340 



TGCTTTTGGT TGTCCCCAAA CTGATCATGG CTTTGAGA.CC GA.TCCTGACC CTTTCCTGCT 90 Q 

25 GGACGAACCA GCTCGACGAA AAAGAAAGAA CTCTSTGAAG GTGATGTAGA AGTGCCTGTG 960 



GCCAAACTGT GGGAAAGTTG TGCGCTCCAT ' 4GTGGGCATC AAACGAGACG TCAAAGCCCT 1020 
CCATCTGGGG GACAGAGTGG ACTCTGATCA GTTCAAGCGG GAGGAGGATT TCTACTACAG ' 1080 


AGAGGTGCAG CTGAAGGAGG AATCTGCTGC TGCTGCTGCT GCTGCTGCCG GAGAGGGGCA 1140 . 

GTC CCTGGGA CTCCCAGGTC CGAGCCAGCT CCCACCCCCA GCATGACTGG CCTGCGTCTG 1200 

35 TCTGCTCTTC CACCACCTCT GCACAAAGCG CAGTCCTCCG GG C CAG AACA TCCTGGCCCG 1260 

GAGTCCTCCC TGCCCTCAGG GGCTCTCAGC AAGTCAGCTC CTGGGTCCTT GTGGCACATT 12 20 

CAGGGAGATC ATGCAT AC GA GGCTCTGCCA TCCTTGGAGA TCCCAGTGTC ACGACACATC 1330 


TACACCAGTG TCAGCTGGGC TCCTGCCCCC TCCGCCGCCT C-GTCTCT!<rrc TCCGGTCCGG 1440 

AGCGGGTCGC TAAGGTTCAG CGAAGGCGCA GCAGCCAGCA C'GTGCGATGA AATCTCATCT 1500 

45 GATCGTCACT TCTCCACCCC GGGGGCAGAG TGGTGCCAGG AAAGGCCGAG C-GGAGGCTAA 1560 



GAAGTGGCGG AAGTGTATGG CATGGAGCAC CGGGACCAGT GGTGCACGGG CTGCGGGTGG 16 20 

AAGAAGGGCT GCCAGCGCTT TCTGGACTGA GCTGTGCTGC AGGTTCTAGT CTGTTCOTGG ' 1530 


CCCTGCCGGC AGGCACTGAC . AAGAGGCCAG TGTGTCAGCA GGCCTCAGCA GAAACCGAAA ' 1740 

GAGAAAGAAC GGAAACACGG AGTTTGGGCT CTGTTGGCTA AGGTGTAACA CTTAAAG-CAA 1300 

55 TTTTCTCCCA TTGTGCGAAC ATTTTATTTT TTAAAAAAAA GAAACAAAAA. TATTTTTCCC 1360 

CCTAAAATAG GAGAGAGCCA AAACTGACCA AGGCTATTCA GCAGTGAACC AGTGAGCAAA 1920 

GAA.TTAATTA CCCTCCGTTT CCCACATGCC CACTCTCTAG GGGATTAGGT TGTGCGTGTC 19.80 

AAAAGAAGGA ACAGC TCGTT CTGCTTCCTG 'TTGAGTCGGT GAAT7 CTTTG CTTTCTAAAC
TCTTCCAGAA AGGACTGTGA GCAAGATGAA TTTACTTTTC TTAAAAAAAA AAAAJ^J\AAAA 
AAAAACTCGA. 

(2) INFORMATION FOR SEQ ID NO: 211:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 933 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211:
GGCACAGGAA AAAAAAGAAA ' AAAGAAAAAA GAAAAAAGTT TTTGTACCCA CAGATTAGCA 
TTTTCTTGAT GTTTGAAAAA AGTTTAAGCT ATGTCCTAAT TTAAAAATGA GCACAAACTA 
CTTAACAGAT GTCTGTTCCC TCTTCTCTTA CTTAAATTAT CTTTATTTTC ACCATCACCT 
CCCAGTGCCG AACACCTGAM CTCTGTGTTT ^ TGTGGTTGGA TCCTGOGTTG CCAAGTTC-CT 
ATTTGGTCAG TCCCTGGCCT GTGGGGCGGT CTCAGGAAGT GGCATGCTCT TCAMGHAC3GA 
TCGTTCATYT CCAGTATAA.C CAWTTTGTTA ATAATAGTTG ATAATTCCCA GCTTTT AC CA 
GATGARTTTT GACTTATTTT TCCTCCTTTG ACCTGTTCAA AGCTAACATA TCTCGGTCAG 
TTCGGAGAGG GTGGGGGATT TGA.GAATGTG AGGAGGAGTG GGGTTAGAAT GGGTTTGCCT 
ATCTGGGCAA GGAAAGAGTT CCTAGTCGAT TCGGCAGfcAT GACAAAATGA TTCCATGGAT 
AGAATCGTCC CATGTTGCTG GAACACCTCA CGTGTTGTGA ACGCCTTAAA TTCCTGCCAT 
CCCTTCTCTG ATTCCCCACC TCCCTGTAGT TTCCACAGGA TTTATCTCTC TGTACCCCCG 
TCCTCCAA.CT CTACTCTGTC AGCCTCTCCT CCATCC'CTTA CTTCCCTTCT AAATTCCAGG 
AGATGACCTC ACTTTGCAAA GCAAATTGGA GCCAC'CAAAT TGTAGCTCTC CTCGGTGGAA 
ACTGCATCTG TGCTCATCCC TGCACCTTCT TGCAGAAAGC CGCCCCCTCA GGCCAAGATG 
AGTGCCTGGC CCCCATGGGA GACTCAGACA CTTTGACCCC TTGTGACTTC AGCATCTCCC 
TCTTTAAAGA TTCTCTCCCA ACATTCAGTC GTGCTCGA 

(2) INFORMATION FOR SEQ ID NO: 212:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1551 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:

5 AGGCTGGACT AMCMAGAG AACCAGGAGA GAAAGAAAGA TTTAAGAGAC TGAGTAATAT £0 

. „ TTTTTGACAG ATCATTTAAG AAACTGAGTA ATTTTTTTTT TCTCCAAAAG GGCATGGGTT I2Q 

TTTTTTTTGT TTTGTTTTTT CTCTATTTGG CACTTTCTAG GGATTGGTCT ATAAATTTTT 130 


TGAAAGATCA TAGGATAAAT TTCTTTGTAG CAACTTCCTA TTTTAGTGTT TATGTTAGGG ' 240 

GARCCCCAFG TGTCCCTGCT GATACGCCAT TAGGGCCACT TCTCAGGCTC TGGCTACATC 300 

15 ATAATGCTTT TTTTTCTATC TTGCCAAAGT TTCCSSGAAAA TTKAKGTTTT CTAATTTTAA 3 50 

AAAAATTGGT TGTGGAGATG GGATGGGACC TCTTTATAAG CCCTGAAAAT AAGTGATTTN 420 

TTTTAAGTGC TATTCTGCTA TAAACCTGAT TCTCACTTTT TTCTGTAGAC AACAGTTTTT 430 


TATAATATAT CTATTTTGTG TGGACATTAT TTCCTTTTAA CCAATACTGA AATTCGATAG 540 

TGTAWRCTTT CTCCACATTT TCTTTGATTA A.TACTTYCTT AAAATAGACA CTTG3ATTGG 5J0 

25 CACCAGCTGT CACCAATAAA GCTGCCCTGA ACA.TTGTCAA TCAATCCTGT TAA.CCAATTT S50 

GAGAATTTTT CTGGAATGCT TAGTTAGGGA itGAAATTGCT. GGGTTA.TAGG TATGAGTA.TG 720 

CTTGATATAC TTTTCTCCAG AATGTCTACA CCTGTGTGTA CACCACATCT CCAGAGATAG 7 30 


GGGAATCTTA TGTCCCTGCT AACTGCTCTC- GTTATTTAAT TTTCTGACAT TTGCCGCCGC 840 

CGCCGCCCCC TGCCCCCAAC ACACACATGG TATAAAGTGG TAGTTTCTTG TTTTAAATTG 9 30 

35 AACTTTTGAA TGATTTGAAT TTGGGCATTT CTTTGTATCC TGAGTTATTT TGGTTTCCCG 950 

TTATGTGAAT ATCCTTTTCC TATGCTTTAA CTACTTTTCT AATTTGTCCC TTTTTTJiGGT 1C20 

TATCAAATTC CAGGCCATTG TCTATTCCAT CGTCACTTTT GGGTATTGGA AACATCTTTC 1C30 


CATTCTGTAG GCTGTCTGTT GAA.CATAAA.T CTTGATTTTT ATGTAA.TCAG ATTTTTCTCC 1140 

TTACGGTTAT GTTCTTGGAA TTTTATTTAA. GAAA.TCTTTT TCTATCCTGA GAC'CACA-AAA. 1200 

45 ATGTCCCCAC O.TTTTCTTC TGTTTCATAG TTTTGCCTTG TATGTTTAAT CCTTTAAGGC .1250 

ATGTGTAGTT CATTTTATAT GGTGTGAAA.T AGTTCTTATT CATTTATTCA A.GACATATTG 1320 

GTGGAGTGCC TGCTGATGGT AGT ACTCTT C AGAGTA.CTTT GTA.TATATTT GTGAACACAT 1330 


ATTCTTGCCC TGGAAGCTTA TGTTGTCMTT CAAGGTAGAT CGMTA.CTCGG TTTCCACCTG 1440 

TTTTCTTCAG CCCTCAGGAT GAATTCC^CA AJTTTTACACA TAGCACCAGT TAAGGAATAG 1500 

55 GCTTTATTGG AGAAAAGGAA GGCTTATTAG ACCAGCATCA GCAAAAAAAA A 1551 

(2) INFORMATION FOR SEQ ID NO: 213:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 997 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:
AGACAGTCCT OAACAGAACC TAATCATGCT GGCACGCTA.-. 7OTCA0AC7T rTAC-CCTCCA 
GAACTGAGAG AACA.TAAACT CCAGTTGTTT AAGCTA.CCCA GJCGArGGTA. TTTGTTAGTA 
TAGCCCAAGC TAAGTCAGGT GGAAAGGCAG AAATA-TTTTG AGAAGA?.OCA OTTCTAjCAAA. 
AACAGAGTTG TTCTAAATGA AATGGCCAGA TATTTGATCT TCTTGATAGT ASTAGTTAGG 
AAAGTTTCAT TAAAC.ACCAC TTGJSCCAGCA CCCAGGCCTG CG--CGCTCAC- AACC-GC-AAC 
AAAAGCAAAT GATTTGAGGA ACAAAA.GAGT GGACACAGAG CGTCOGAiGAA. GATGGCTCCA. 
TCTTCTGAGA TGATCTTCTG AGATCA.TCAA TTTTCTGCAC CTGATGTC GT ACTGGAAGGG 
TAGTAGATAA GAGCAAAGAG ACTTC CTGAT CCTGTGG.-AA ATGCTGGA3C ICTGCTGAOG 
GAGAGGGTGA CACTGGGACG AACAGAAGGC CGGACATTTA TtTGTGGCAG ICCTTCTGCA 
.CCTGGGCCCT CTTCAGGCCT TGTACCTTGC ACTCCCCAGG CCAGIGTAGC ACCTGGTAAG 
CTGAAGTTAG GTATTTGAAG AGATAATTTG CCGGCAACAA AGAATTACTT AAAAiG-AAAA 
GGAAACCACT AAATTCCACT TGAGAAACCA GTTTGTTCA3 "7TTTGAGTTT 0GCAA-ATT7G 
AAACTTTGTC TTTGGCACCA TATGATTCTG TTAGATTAGG GCTGAOCAAG GCTAAGAGAC 
AGAGGTAGGT GTACCAGGTG CCAGTC-GTCA AGAATG AAA. 1- .-AC GTGTCAG .-GAGAGA7CA 
GTTTGTAATA ACCTAACAGT TTTCCTTGG3 TATTAGHPAA ;AAAA^AAAA. TTAGAA.TA.-A. 
ATGTGAGTGG CATGCAGGCA AGTACAGATA TC-GAAA.TGAA AGCCCTGTCT AGAAGTGG-A. 
GATTTGTTTG TTAATAAAAT TGATTGGGAT GACTCGA 

(2) INFORMATION FOR SEQ ID NO: 214:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1496 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214:
GAATTCGGCA CGAGTGACCA CAGATATCTT TCGGTTTCAG CCTGACCAG-- ATC-CTOTOCA 



CTATGTTTTT TTTAATCGAT TGACATGTGA. TGAATG GAGA AATTTAGCCG 
TTTTCCATG7 TTGTCATAGC TTCATCACGC ACGATGGAGG TCACTTC-.GG ACTATCGGGA 180

GCGGCCTCAC GGACAGATC?. GTGAATTTCC TTTTCGTTTT TGTTGATGTA CCGGATTGTC 240 

5 GACTCGTTAA CATTGAGCTC ATGGCCAACA GCACTGTAAC TGATGCGTGA TTGGAOCTTA 300 

TCCAACACGC GGAKTTTCTC CGTAAGGSAM ATCAMGCTCT TCTTTCGGTT AGGAACACTC 360 

GGCARAP.CTT AAE.CACTACG CTTGGGGGCC ATTTTAGAAA GCAAAACGAG CCACAAAAAG 420 


CAGAAAAAAA AGTGTGAGTA AACAGACTGM MGAMAGGACT CTTTGTTTAC AGCAOjG-GAG 430 

CTGCGACTAG AAGGCGGCGC TTCTCCCCAG TTGAAACTTC AGCTGGGAAC CTTACCTCCG ^540 

15 CCAACTCCAA ATTTTCAC G C TCTGCGCATG CCCGGGAAAS AAACCGCCAG AACAGTACCG 600 

TGATGATTGA TTTTAGGGTT ACAAATACAT TTTAGGAAGT AAGTGAATTT GGGATTACGA SoQ 

ATT AATGATT ■ AATGAAGGTC ACCTGTATTT CCATAGATAT GTAATTTTAT TTAAGCAGGT 720 


TTATTATATT AAGGCGGSGA GGCAGCGCCG AAGACTACAA GTTCCAGCAT GCACCGCGTC 780 

CGGGCGGGTT CGGGCTCCCA GGGAGGGCTT CAGGGAGGGG AGCCOGGAGG CATCGGCCGG 940 

25 AAGTGTCGTA GGGCAACCAC GTAGTACTCT CTGCGCATGT GCAAAGCGCT CTCGGGOZ-CC 900 

GCCCTAGCTG CCGTCGCCGC CGCCGGGGCT CTATGGTCTG TCCCTAGAGC TTTGCGGTTG 960 

GAGGCGGCTG CTGCGGTCTT GTGAGTTTGA CGAGGGTCGA GCGGCAGCAA CATGGAGGAA 102 0 


TTCGACTCGG AAGACTTCTC TACGTCGGAG G AGGAG GAGG ACTACGTGCC GTGGGGTGAG 1030 

CGATTCCGCC TGAGGCGAGA AGCGAATTGC CCCCCCCC:-.C GCCTCACGTG AGGCGGGCTC 1140 

35 TGCCCCCGCG GGCGTCTGCC CTGTGGCCCA GGTGGTCGAG GGGGGCTCCT GTTCTCGAGC 1200 

GTCCGCTCCC TCAGC-CCCCT CATCCTCGGC CGCTCCGGCC CGAGGGGTGT GCGCGTGGCG 1260 

GTTCTGTGCT CCCCTCGCGT TGGGCAGCTC CGGCCGCCGC CCCCTCTTGC AGGGCGGGAA 1320 


CGGCACATGG ACACGGCCCC TTGTCGCTAG ' GG ACGCTCGT CGGTC2.GCCC GGAACGAC-A 13 30. 

CGCTGCTTCA GAAGTCGGGG CGGCftGTTCG AGCCTTGGAA GTTTTTTTCA GCCCTGGCCC 1440 

45 GAGAGAGCTG CTGGCCAACA ACG CGTGGAA GATAGAGCTG TCCGNTCTCC GNCTGG 1496 

(2) INFORMATION FOR SEQ ID NO: 215:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1308 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215:

60 TTGGCANCNG GGAGAGGGAA AGAGGAGGAA ATGGGGTTTG AGGACCATGG CTTACCTTTC oO 
CTGCCTTTGA CCCATCACAC CCOVTTTCCT CCTCTTTCCC TCTCCCCOCT GCCAAAAAAA 120

AAAAAAAAGG AAACGTTTAT CATGAATCAA CAGGGTTTCA GTCCTTATCA AAGAGAGATG 130 


TGGAAAGAGC TAAAGAAACC ACCCTTTGTT CCCAACTCCA CTTTACCCAT ATTTTATGCA 240 

ACACAAACAC TGTCCTTTTG GGTCCCTTTC TTACAGATGG ACCTCTTGAG AAGAATTATC 300 

10 GTATTCCACG TTTTT AGCCC TCAGGTTACC AAGATAAATA TATGTATATA TAACCTTTAT 360 

TATTGCTATA TCTTTGTGGA TAATACATTC AGGTGGTGCT GGGTGATTTA TTATAATCTG 420 

AACCTAGGTA TATCCTTTGG TCTTCCACAG TCATGTTGAG GTGGGCTCCC TGGTATGGTA 430 


AAAAGCCAGG TATAATGTAA. CTTCACCCCA GCCTTTGTAC TAAGCTCTTG ATAGTGGATA 540 

TACTCTTTTA AGTTT AGCCC CAATATAGGG TAATGGAAAT TTCCTGCCCT CTGGGTTCCC 600 

20. CATTTTTACT ATTAAGAAGA CCAGTGATAA TTTAATAATG CCACCAACTC ' TGGCTTAGTT 660 

AAGTGAGAGT GTGAACTGTG TGGCAAGAGA GCCTCACACC TCACTAGGTG CAGAGAGCCC 720 

AGGCCTTATG TTAAAATCAT GCACTTGAAA AGCAAACCTT AATCTGCAAA GACAGCAGCA 730 


AGCATTATAC GGTCATCTTG AATGATCCCT TTGAAATTTT TTTTTTGTTT GTTTGTTTAA 340 

ATCAAGCCTG AGGCTGGTGA ACAGTAGCTA CACACCCATA TTGTGTGTTC TGTGAATGCT 900 

30 AGCTCTCTTG AATTTGGATA TTGGTTATTT TTTATAGAGT GTAAACCAAG TTTTATATTC 9 60 

TGCAATGCGA ACAGGT AC CT ATCTGTTTCT AAATAAAACT GTTTACATTC ATTATGGGGT 1020 

ATGTATGACC TTCATTTTCC AAGAAATAGA ACTCTAGCTT AGAATTATGG ATGCTCTAAA 1030 


ATGTCAGAAT GGGAACTCTC CTCGAAGTTC TCCCAAACTC AGAGACAGCA CTGCCTTCTC 1140 

CTAAATGATT ATTCTTTTCT CCCTGTTTTC TGGTATTTTC TAGGCATC-CT TCTCACCACA 1200 

40 GCCATAACCC TTTTTTACTT CCATTAGGCC GTATAACTGG NGGGACNGCT GGTCGGTATA 1250 

TAATACTGGT WCCAACAMAG GGGTTCTGGA TGTACACMAG GTTATCTT 1308 


(2) INFORMATION FOR SEQ ID NO: 216:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1705 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216:

TGGCCATGGA AGCGCTAGAA GGTTTAGATT TTGAAACAGC AAAGAAGGAT TTCCTTGGAT 60 

. CTGGAGACCC CAAAGAAACA AAGATGCTAA TCAOCAAACA GGCTG ACTGG GCCAGAAATA 120 

TCAAGGAGCC CAAAGCCGCC GTGGAGATGT ACATCTCAGC AGGAGAGCAC GTCAAGGCCA 180

TCGAGATCTG TGGTGACCAT GGCTGGGTTG ACATGTTGAT CGACATCGCC CGC AAA.CTCG 240 

ACAAGGCTGA GCGCGAGCCC CTGCTGCTGT GCGCTACCTA CCTCAAGAAC- CTGGACAGCC 300 

CTGGCTATGC TGCTGAGACC TACCTGAAGA TGGGTGACCT CAAGTCCCTG GTGCAGCTGC 350 

AGTGGAGACC CAGCGCTGGG ATGAGGCCTT TGCTTTGGGT GAGAAGCATC GTGAGTTTAA 420 

GGATGACATC TACATGCCGT ATGCTCAGTG GCTAGCAGAG AACGATCC-CT TTGAGGAA.GC 430 

CCA.GAAAGCG TTCCACAAGG CTGGGCGACA GAGAGAAGGG GTCCAGGTGC TGGAGCAGGT 540 

15 CACAAACAAT QCCGTGGCGG AGACGAGGTT TAATGATGCT GCCTATTATT ACTGGATGCT 600 

GTCCATGCAG TGCCTCGATA TAGCTCAAGA TCCTGCCCAG AAGGACACAA TGCTTGGCAA 660 

GTTCTACCAC TTCCAGCGTT TGGCAGAGCT GTACCATGGT TACCATGCCA TCCATCGCCA 720 


GACGGAAGAT CGGTTCAGTG TCCATCGTCC TGAAACTCTT TTCAACATCT CCAGGTTCCT 730 

GCTGCACAGC GTGGCCAAGG ACACCCCCTC GGGCATCTCT AAAGTCAAAA TACTCTTGAC 340 

25 CTTGGCCAAG CAGAGCAAGG CCCTCGGTGC CT ACAGGGTG . GCCCGGCACG CCTATGACAA 900 
GCTGCGTGGC CTGTACATCC CTGCCA.GATT GCAAAAGTCC ATTGAGCTGG GTACCCTGAC ■ 960 

CATCCGCGCC AAGCCCTTCC ACGACAGTGA GGAGTTGGTG CCCTTGTGCT ACCGCTGCTC 1020 


CACCAACAAC CCGCTGCTCA ACAACCTGGG CAACGTCTGC ATCAACTGCG GCCAGCCCTT "1030 

CATCTTCTCC GCCTCTTCGT ACGACGTGCT ACAGC TGGTT GAGTTGTACC TGGAGGAAGG 1140 
35 GATCACTGAT GAAGAAGGGA TCTCCCTCAT CGACGTGGAG GTGGTGAGAG GGAAGGGGGA ■ 1200 

TGACAGACAG GTAGAGATTT GCAAAGAACA GCTCCCAGAT TCTTGCGGCT AGTGGGAGAG 1260 

- CAAGGGACTC CA.TCGGAGAT i>IAGGACCCGT TCACAGCTAA '3CTRAGGTTT G.AGCiiA.GGTG 1320 


GCTCAPAGTT CGTGCCAGTG GTGGTGAGCC GGCTGGTGCT GCGCTCCATG AGCCGCCGGG 1330 

ATGTCCTCAT CAAGCGATGG CCCCCACCCC TGAGGTGGCA ATACTTCOGC TCACTGCTGC 1440 

45 CTGACGCCTC CATTACCATG TGCCCCTCCT GCTTCCAGAT GTTCCATTCT GAGGACTATG 1500 

AGTTGCTGGT GCTTCAGCAT GGCTGCTGCC CCTACTGCCG CAGGTGCAAG GATGAC CCTG 15 60 
GCCCATGACC AGCATCCTGG GGACGGCCTG CACCCTCTGC CCGCCTTGGG GTCTGCTGGG . 1520 


CTGTGAAGGA GAATAAAGAG TTAAACTGTC AAAAAAAAAA AAAAAAAAAA. AAnAAAAAAA 1630 

AAAAAAAAAA AAAAAAAAAA AAAMA 1705 




(2) INFORMATION FOR SEQ ID NO: 217:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 999 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217:
AGCAAATCAC CTTAACGATC TGGAATGAAA CTX3TGACCAG TGCCGCCCTG GGTGGTTCTC 50 
10 GAGAGACTGC CGTCTTCTTG TTTGGCCATA GGTGCTGGGG CCCCGGCTTC AGTCACTGTC 120 
TCAGACAGKA GTCCCGATAA GCAGATCACC AGTCCTCCAC TGTCCTTCCT GTCGGCCTTG 130 
CTGCATGAC-A AGATAGCTGC TTCCTCCCTC TTTTCCTACA CTGTAAATTA TTGTTTTACA 240 


ATTGAGTGYC TTAATAATAG TYTACAAATA CTATGTATTT ATGCAAAACT GTTAAAGTTC . 300 
TCATCTGTTA TGATTGGATA. CTTGGTCTTG TCAGTAGTGG TCAGCA.TTGG GTTGTGAGCT 3 60 

20 TGTCCTACTC CATACGTGTT TATCCTGCTA TGCATTTTAC ATTGTGTGTT C\CATCTATT 420 
CCAAGGAGCG TTGCTAGAAA CAACACTGGC GGTTCCTGCA GGC CAGGC AG GCATTGGCCC 430 
ATGCTGTGTC GCJi.TAGGA.GC CAATGGAAAG AACGTAGCTT GGTCTGCTAG OCAGCCGTGG 540 


GGTGGCGCAG GCCAGGCAGC CTCTGCACCA ^AGTCCAGCA CCTGCCCATT CCCCAGTCAC 600 
ACAATCATAC TCTTCTTTCA TAGAGATTTT ATTACCACCT AGACCACCCT AGTTTTCCTC 660 
30 TCTGTTAGTG TCCTGAGCTC TTTTGCAACA AAATGTAGGT ACAGACCAA.T CCCTGTCCCT 720 
TCCCCAATCA GGAGCTCCA.C ACCATGAGTT GTTTGGTTTT CCAGAAGCTG CCAGTGGGTT 73 Q 

CCCGTGAATT GCGTTAAGAT ATCGATGATK TTTTTTATTG TTTTTCTTCT TGTTTTTTTA 340 


AATAATATAT TTAAAGGCAG TATCTTTTGT ACTGTGAATT TGCAGTAGAA GATGCAGAAT 900 
GCACTTTTTT TTTACTTCTG TTGGTGTGTA TTGTATATAG TOTGTGTGCT TCTTGTGATG 960 
40 AAAATAAACT TTTTCTTTAT AAAAAAAAAA AAAAAAAAC 999 



(2) INFORMATION FOR SEQ ID NO: 218:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 941 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218:

55 GGCACGAGTA GCATTTCATT TAATCTGCAG GTATATTCTC CCAACAGTTT ATTGTCATGT 60 

GATGTCCTCA GCCAA.GATTG TPAGGC AC-AG AGGAGCTGTC CC AAC CT ACT ATACCACCGA 120 

GGCTGGAGAG ATOT -Ji'' M '' n TGGTATTAAA CTGGAGTCTC TCCATCCTTC A.CATTGTTGA 130 

TGTCCTCTGT AGCAAACCCG AAA_AGTC\GT GAjCAGAAGAT GCCCCTAGCG GTTTOAGCCA 240

GAGAATGACA GCTCTGGTTT GGAjGAAAAGG CCCGGATGGT GGCTCTAGAA AGCCCATCCT 300 

5 TCTGCTCTTC TTTTTTCTCC CCCTTATATT CTGCTTTCAT TCATTCATTC ATTCATCAAA 260 

O.TTTGTTGA GCACCTATTA TGTGTCAAGC TCTGTGCTAG CCTCTGGAAA A.CCTGCCCTC 420 

ATGTAGCTCA CTGTGGAGTA GGAGAAACAA TGACTACACT ATGATAAGCA CGGGTTGTCA 430 


GGGTCTCA.CA GAGCAGTGGC CCCTCATCCA GACCGATGAG GTCAAA.GAAG GCA.TC CAGGC 540 

GAGGATGGTG TCAGAGCTAA CTGAAGAATG AGAGGGAGCT GCAC CASCAG GGGTTGGAA.C -600 

15 ■ TGAAGGTGGC AGTGCCTGGA GTCTTC-ATTC CAGOGAGGG AGAGCAGTCT GTGAAAAGGC 650 

ACCAAGGGTG GGAGAGGGCA GAGCACATGG AGGAACTTGA GGTAGTTCTG GATGGCSCTG 720 fc 

GGGCAAAGCT AGAGAGGTAA GAAGAATCTA. CAAATGTTCG TCGAGTTACA TG AA.CTTC CA 730" 


TCCCAA.TAAA CCCATTGGAA ACGAAAAATT TAAGTCAGAA. GTGCA.TTTAA GGCTGGTCCG 840 

AGTAGAATGA TTTTTACAAC GAATTGATCA CAACCAGTTA CAGATGTCTT TGTTCCTTCT 900 

25 CCACTCCCAC" TGCTTCACCT G ACTAGCGTT ' TAAAAAAAAA A '941 

(2) INFORMATION FOR SEQ ID NO: 219:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 575 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219:

40 TAAGTGGAAT CCCCCGGGGT TGCAGGGAAT TCGGCACGAG GIATTCTGAG AAGCTTAAGA 60 

CATACTTTGA AGACAACCCT AGGGACCTCC AGCTGCTGCG GCATGACCTA CCTTTGCACC 120 

CCGCAGTGGT GAAGCCCCAC CTGGGCCATG TTCCTGACTA CCTGGTTCCT CCTGCTCTCC 130 

GTGGCCTGGT RCGCCCTCAC AAGAAGCGGA AGAAGCTGTC TTCCTCTTGT AGGAA.GGCCA 240 

AGAGAGCAAA GTCCCAGAAC CCACTGCGCA GCTTCAAGCA CAAAGGAAA.G AAA.TTCAGAC 300 

50 CCACAGCCAA GCCCTCCTGA GGTTGTTGGG CCTCTCTGGA GCTGAGCACA TTGTGGAGCA 3S0 

CAGGCTTACA CCCTTCGTGG ACAGGCGAGG CTCTGGTGCT TACTGCACAG CCTGAACAGA 420 

CAGTTCTGGG GCCGGCAGTG 'ITGGGCCCTT TAGCTCCTTG GCACTTCCAA. GCTGGCATCT 430 


TGCCCCTTGA CAACAGAA.TA AAAATTTTAG CTGCCCCAAA AAAAAAAAAA AAAAAAAAAA. 540 • 

CTCGAGGGGG GGCCCGTACC C\ATTCGCCC TATAA 575 

(2) INFORMATION FOR SEQ ID NO: 220:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3013 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220:

GCCAGCCTTA CAGGTTTTAC GTGAAATGAA AGCCATTGGA ATAjGAACCCT CGCTTGCAAC 

ATATCACCAT ATTATTCGCC TGTTTGATCA ACCTGGfiGAC CCTTTAAAGA GATCATCCTT ' 

CATCATTTAT GATATAATGA ATGAATTAAT GGGAAAGAGA TTTTCTCCAA AGGACCCGGA 

TGATGATAAG TTTTTTCAGT CAGCCATGAG CATATGCTCA T CTCTC AGAG ATCTAGAACT 

TGCCTACCAA GTACATGGCC TTTTAAAAAC CGGAGACAAC TGGAAATTCA TTGGACCTGA 

TCAACATCGT AATTTCTATT ATTGGAAGTT CTTCGATTTG ATTTGTCTAA TGGAftCAAAT. 

TGATGTTACC TTGAAGTGGT ATGA3GACCT GATACCTTCA GCCTACTTTC CGGACTCCCA 

AACAATGATA CATCTTCTCC AAGCATTGGA "TGTGGCCAAT CGGCTAGAAG TGATTCCTAA 

AATTTGGGAA AGATAGTAAA GAATATGGTC ATACTTTCCG CAGTCACCTG AGAGAAGAGA 

TCCTGATGCT CATGGCAAGG X»CAAGCACC CACCAGAGCT TCAGGTGGCA TTTGCTGACT 

GTGCTGCTGA TATCAAATCT GCGTATGAAA GCCAACCCAT CAGACAGACT GCTCAGGATT 

GGCCAGCCAC CTCTCTCAAC TGTATAGCTA TCCTCTTTTT AAGGGCTGGG AGAACTCAGG 

AAGCCTGGAA AATGTTGGGG CTTTTCAGGA AGCATAATAA GATTCCTAGA AGTGAGTTGC 

TGAATGAGCT TATGGACAGT GCAAAAGTGT CTAACAGCCC TTCCCAGGCC ATTGAAGTAG 

TAGAGCTGGC 'AAGTGCCTTC AGCTTACCTA TTTGTGAGGG CCTCACCCAG AGAGTAATGA 

GTGATTTTGC AATCAACCAG GAA.CAAAAGG AAGCCCTAAG TAATCTAACT GCATTGACCA 

GTGAGAGTGA TACTGACAGC AGCAGTGACA GCGACAGTGA CACGAGTGAA GGCAAATGAA 

AGTGGAGATT CAGGAGCAGC aatggtctca CCATAGCTGC TGGAATCACA CCTGAGAACT 

GAGATATACC AATATTTAAC ATTGTTACAA A3AAGAAAAG ATACAGATTT GGTGAATTTG 

TTACTGTGAG GTACPJ3TCAG TACACAGCTG ATTTATGTAG ATTTAAGCTG CTAATATGCT 

ACTTAACCAT CT ATTAATGC ACCATTAAAG GCTTAGCATT TAAGTAGCAA CATTGCGGTT 

TTCAGP.CACA TGGTGAGGTC CATGGCTCTT GTCATCAGGA TAAGCCTGCA CACCTAGAGT 

GTCGGTGAGC TGACCTCACG ATGCTGTCCT CGTGCGATTG CCCTCTCCTG CTGCTGGACT 

TCTGCCTTTG TTGGCCTGAT GTGCTGCTGT GATGCTGGTC CTTCATCTTA GGTGTTCATG 
CAGTTCTAAG ACAGTTQGGG TTGGGTCAAT AGTTTCGCAA TTTC-jGGATA TTTCGATGTC 1500

ACAAA.TAACG CATCTTAGGA ATGACTAAAC AAGA.TAATGG CAGTTTAGGC TGCACAACTG 1550 

5 GTAAAATGA.C TGTAGATAAA TGTTGTAATT AjGTGTACACG TTTGTATTTT TGTTAATATA 1620 

GCCGCTGCCA TAGTTTTCTA ACTTGAACAG CCATGAATGT TTCATGTCTC CCTTTTTTTT * 1530 

TTGTCTATAG CTGTTACCTA TTTTAGTGGT TGAAA.TGAGA GCTAGTGATG ACAGAAGGAT 1740 


GTGGAATGTC TTCTTGACAT CATTGTGTA.T TGCTGGTAAT CAAGTTGGTA ACGACTACTT 1300 

CTAGCAGGTC TTACCACTAT GACTTAAGTG GTCCTGGAAG GCAGTAAGTG GAGGTTTGCA 1360 

15 GCATTCCTGC CTTCATGAGG GCTTCTACCA CTGACCACTT TGCACGTACC TGGCTCCCAG 1920 

ATTTACTTAG GTACCCCACG AGTCGTCCAC ATAAGCAGCT TCA.TCTTTAG CTTGC C AG AG 1380 

TTGACAATTA TGGGATACTC TAGTCTACTT ATACTTGTGT TCCCATCTGT CTGCCATCCT 2040 


CTGAAGGCCA. GGACCCA.GTC ATACATCCTT AGAAA.CCAAA GTATGGTTTT TGTTTTCTCT 2100 

TGGAATGTCA GGTCTTAAGG CATTTAATTG AGGGACAAAA AAAAAAAAAA GCCGATATAG 2 ISO 

25 TAGCTAGCTA CTTAAGCATC CATGGGTA.TT GCTCCATATC AAAGCAGATT TCCAGGACAG 2220 

AAAGAGTAAA. TTAGCCTTCA GTCTTGGTTT ' ACAGCTTCCA AGGAGAGCCT TGGSCA.CCTG 2230 

AAATGTTAAC TCGGTCCCTT CCTGTCTCTA GTTCATCAGC ACCTGCAGAT GCCTGACTCT -2340 


TGTTAGCCTT ACTATTCAAT ACAGTCCTTA GATTCACGGT ATGCCTCTTC CTATCCAGGC 2400 

ACCTATTCTG AATCACCATG TTGCTCTGCA GCTAGAGTTG ATAGGAGAAA ATCCATTTGG 2450 

35 GTAGATGGCC TATGAATTTG TAGTAGACTT TCAAAATGAG TGATTTGTTA GCTTGGTACT 2520 

TTTAAGTTTG TGGTACAGAT CCTCCAAACC CATACTGTGA GCAA.TTAACT GCCTTGAACA 2530 

TAGAGAAAAA TTAAGGGCTC ACAGGATGAG TCTCCATTCT CTGTAAATGC TTATTTTATC .2640 


ATAGTCTTTA GCCTCTAACT ATGAGTAAAA. TGTTGTCTTC GGCCGGGTGT GGTGACTCAG 2700 

ACCTGTAACC TCAGCACTTT GGGAGGCAGA GGTGGGAGGA TCACTTAGGT CCAGG AGTTC 2750 

45 GAGACTAGCC TGGGCAACAT AGTGAGACAC CGGATCTACA AAAAAA.TAAA AAGCCAGACT 2320 

GGTGGTATGT ATCTGTGTCC CAGCTAA.TTG GGAGGGTGAG ATGGGAGGAT TGTTTGAGCC 2330 
TAGGAGAGGG AGGTTGCAGT GAGCCGTGAT CGGAGCACTG C^CTCCAGCC TGGGCAACAG .- 2940 


AGCAAGAGCC TGTCTTGGAG AAACCAGAAT TTTGGAAGAG OAATGGGGG TGAGTGCAGT 3000 

GGCTCATGCG TGTAATCC 3013 

(2) INFORMATION FOR SEQ ID NO: 221:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 963 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221:
GGCACGAGGG CCGCGGGACA TCCAC 3GGGC GCGAGTGACA CGCGGGAGGG AGAGCAGTGT 60 
10 TCTGCTC-GAG CCGATGCCAA AAACCATGCA. TTTCTTATTC AGATTCATTG TTTTCTTTTA 120. 

TCTGTGC3GGC CTTTTTACTG CTCAGAGACA AAAGAAAGAG GAGAGCACCG AAGAAGTGAA 130 
AATAGAAGTT TTGCATCGTC CAGAAAACTG CTCTAAGACA AGCAAGAAGG GAGACCTACT 240 


MAAA.TGCCCA TTATGACGGC TACCTGGGTA AAGACGGCTC GAAATTCTAC TGCAGGCGGA 300 
CACAAAATGA AGGCCACCCC AAATGGTTTG TTCTTGGTGT TGGGGAAGTC ATAAAAGGCC 360 
20 TAGACATTGC TA.TGACAGAT ATGTGCCCTG GAGAAAAGCG AAAAGT AGTT ATACCCCCTT 420 
CATTTGCATA CGGAAAGGAA GGCTATGCAG AAGGCAAGA.T TCCACCGGAT GCTACATTGA ' 430 

TTTTTGAGAT TGAACTTTAT GCTGTGACCA AAGGACCACG GAGGATTGAG ACATTTAAAC 540 


AAATAGACAT GGAGAATGAG AGGCAGCTCT CTAAAGCCGA GATAAACCTC TACTTGCAAA 6QQ 
GGGAATTTGA AAAAGA.TGAG AAGGCACGTG ACAAGTCATA TCAGGATGCA GTTTTAGAAG 650 
30 ATATTTTTAA GAAGAATGAC GATGATGGTG ATGGCTTCAT TTCTCCCAAG GAATACAATG 720 
TATACCAACA CGATGAACTA TAGCATATTT GTA.TTTCTAC TTTTTTTTTT TAGGTATTTA 730 
CTGTACTTTA TGTATWAAAC AAAGTCMCTT TTCTCCMAGT TGc'A.TTTGGT ATTTTTGCCC 840 


TATGAGAAGA TATTTTGATC TCCCCAATAC ATTGATTTTG GTATAATAAA JTGTGAGGCTG 90 Q 

TTTTGGAAAC TTAAAAAAAA ' ATTTAAAAAA AGTGGAGGGG GGGCCGTACG CAAOTCGCCG 960 
40 MATATGAT .963 



(2) INFORMATION FOR SEQ ID NO: 222:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1404 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222:
55 CGTTTTCCGG CCGTGCGTTT GTGGCCGTCC GGCCTCCCTG ACATGCAGCC CTCTGGACCC SO 
CGAGGTTGGA CCCTACTGTG ACAGACCTAG CATGCGGACA CTCTTCAACC TCCTCTGGCT 120 
TGCCCTGGCC TGCAGCCCTG TTCACACTAC CCTGTCAAAG TCAGATGCCA PAAAAGGCGC I34) - 

CTCAAAGACG CTGCTGGAGA AG.AGTCAGTT T/TC.AGATAAG CCGGTGCAAG ACCGGGGTTT
GGTGGTGACG GACCTCAAAG GTGAGAGTGT GGTTCTTGAG CATCGCAGCT ACTGCTCGGC 
AAAGGCCCGG GACAGACACT TTGCTGGGGA TGTACTGGGC TATGTCACTC CATGGAACAG 
CCATGGCTAC GATGTCACCA AGGTCTTTGG GAGCAAGTTC ACACAGATCT CACCCGTCTG 
GCTGCAGCTG AAGAGACGTG CGCGTGAGAT GTTTGAGGTC ACGGGCCTCC ACGACGTGGA 
CCAAGGGTGG ATGCGAGGTG TCAGGAAGCA TGGCAAGGGO CTGCAGATAG TGCCTCGGCT 
CCTGTTTGAG GACTGGAGTT ACGATGATTT CGGGAACGTC TTAGACAGTG AGGATGAGAT 
AGAGGAGCTG AGCAAGACCG TGGTCCAGGT GGCAAAGAAC CAGCATTTCG ATGGCTTCGT 
GGTGGAGGTC TGGAACCAGG TGGTAAGC CA. GAAGCGGGTG GGCCTGATCC ACA.TGCTCAC 
CCACTTGGGC GAGGCTCTGG ACGAGGGGCG GCTGCTGGCC CTCCTGGTCA TCCCGCCTGC 
CATOkCCCCC GGGAGGGACC AGCTGGGCAT GTTCACGCAC AAGGAGTTTG AGCAGGTGGC 
CCCCGTGCTG GATCGTTTCA GCCTCATGAC CTAGGACTAG TGTACAGCGC AT C AGCCTGG 
CCCTAATGCA. CCCCTGTCCT GGGTTGGAGC CTGCGTCCAG GTCCTGGACC CGAAGTCGAA 
GTGGGGA&3G AAAATCCTCC TGGGGGTCAA CTtCTATGGT ATGGACTACG CGAGGTGCAA 
GGATGG CCGT GAGCCTGTTG TCGGGGCCAG GTACATCCAG ACACTG^AGG ACGACAGGCO 
CGGGATGGTG TGGGACAGGC AGGYCTGAGA GGACTTCTTC GAGTACAAGA AGAGGCGGAG 
TGGGAGGCAC GTCGTCTTCT ACCCAA.CGCT GAAGTCGGTG CAGGTGCGGC TGGAGGTGGC 
CGGGGAGCTG GGG'GTTGGGG TCTCTATCTG GGAGCTGGCC AGGGGGTGGA CTACTTCTAC 
GAC CTGCTCT AGGTGGGGAT TGCGGCCTCC .GCGGTGGACG TGTTCTTTTC TAAGGCATGG 
AGTGAGTGAG CAGGTGTGAA ATACAGGCCT NCAGTGCGTT TGGTGTGAAA AAAAAAAAAA 
AAAAAAAAAA. AAAAAAAAAA AAAA 

(2) INFORMATION FOR SEQ ID NO: 223:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 707 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223:

NGCGGGCCTG CAGTCGACAC TAGTGGATCC AAAGAATTCG GCAGGAGGGC AGGTCGAGGG

CTCAGAAATG AGCTCTATTG ACGAATTCTG CCGCAAGTTC CGGCTGGACT GCC CGCTGGC

C ATGG AC-C GG ATGAAGGAGG ACCGGCCCAT CACCATC.-AG GACGAGAAGG GC AA.C CTCAA

CCGCTGCATC GCAGSCGIGG TCGGC-GG CTT GAGGAGGGTG A~<«^AAG-G TG-GGGGTGGA 240

GATCCGGGGG ATGGATGAGA TCCAGGGGGA GGGGGGAG^G CTaAGG-GAGA C GAGGGACGG 300 

CATGAGCCAC CTCCCACCCG AGTTGGA2-G-G GGGGGAGACG GGG.-GGGAGT GGGGGGAGAC 3 50 

CCTGAGCGGC ATGTCGGGGT CAGATGAG-— GGAGGA1TGA GAGGTGGGGG AGAGGGTGGT 420 

10 CGACCTGGAG TCAGCCTACA gg rGCGGGAG- GGGGGG^Gir 430 

540 
600 
660 












GTGTCAGAAC TTTT C 



GGGA-- -GGTG CA.GC— 


< 


GGGGGAC 


GGGATGGGTC CGG^iGGGGT 


err: 




- GCAGG _ G-G U.GTGGG---.G 


TCG' 
AC— 


GIGi'riXi 
JiCAAAA 



AAAAAAAAAA AAAAACTCHG GGGGGGGGGG • GGCCCAATCC CGGG2G: 707 



(2) INFORMATION FOR SEQ ID NO: 224:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224:

GGGGAACTGC AGTGACAGGA GGAGTAAGA.G IGGGAGGCAG GAGAGAGGGG GGAGAGAGGT SO 
35 ATGGAGAGGG GGTTCMCGA GGGGAGAGAG GG-GAGAGTAG CA3GGGG22G GGGGGGAG-A ' 120 

TCCAGGGAGA GGAGCGGAAA CAGAAG.-GGG GGAGAA2ACG GGGGGAGGTG TGGGGTGGAG 130 
^ AGCGCCTCAG CCATGTTGGG AGG GAAGGGA GAGTGGCTAC C-GGTGGG GT ACAGAGTGGG 240 

GGGCTGCCCT TGGTTGTGGT GCTGCTGGC G CGGGGGGGGG GGGGGGGGGA G3AGGGGTCA 300 

GAGCCCGTCG TGGTGGAGGG GGAGTGCCTG GGGGTGTGTG AGGGTG-G-GGG AGGTG2TGGA 3 = 0 

40 GGGGGGCCCG GGGGAGGAGG CCTGGGAGAG GCAGGG2GTG GGGGAGGGGG .ATGG GCTGCG 420 



GTCGGAAGCC AMCACCATGA GGGAGGAGGG GAAACGGGGA AGGGGLAG GAEC TGGGGGCATC 430 
TACTTCGACC AGGTCCTGGT GA-GGAGGGG GGTGGGTTTG ACZGGG-GGTC TGGGTGCTTC " 540 

GTAGCCCCTG TCCGGGGTGT CTAGAGGTTG CGGTTG GAGG TGGTGAAGGT GGAGAAGGGG 60*0 

CAAACTGTCC AGGTGAGGGT GATGGTGAAG ACGTGG2GTG TGATCGGAGG - C77GGOGAAT €60 

55 GATC'GTGACG TGACGGGGGA GGGAG2CACG AGCTGGGTGG TAGTGGGGTT GGAGGCTGGG 720 

GACCSAGTGT CTCTGCGGCT GC~CGGGGG AATCTA2TC-G GGGGGGGGAA AG.--GTGAAGT 730 

TTCTCTGGCT TCCTCATCTT GGGGCTGTGA GGAG Z CAAG? TTTTCAAGCA CAAJIAATGCA 340- - 

GCCCCTGAGA ACTTTCTTCT GCCCTCTCTT GC CC CAGAAA C^GCAGAGGC AGGAGAGAGA
CTCCCTCTGG YTCCTATCCC ACYTCTTTGC ATGGGAMCCT GTGCCAAACA CC CAAGTTTA 
AC^AAJ^AP.Y ARARCTGMGG CAjGGTA.TA.C-. GAGCTGGAAG TGGACCATGG AAAACATSGA 
TAACGATGCA TCiTCTTGGT TGGCCACCTG CTGAAACTGT CCACCTTTGA AGTTTGAACT 
TTAGTCGCTC CAMACTCTGA CTGCTGCCTC CTTCCTCCCA GCTCTCTGAC TGAGTTATVT 
TCACTGTACC TGTTCC-J3CA TATCCGCACT ATCTCTCTTT CTCCTGATCT GTGCTCTCTT 
ATTCTCCTCC TTAGGGTTCC TATTACCTGG GATTCCATGA TTCATTCCTT CAGACCCTCT 
CCTGCCAGTA TGGTAAACCC TCCCTCTCTC TTTCTTATCC CGCTGTCCCA TTGGCCOVGG 
GTGGATGAAT CTATCAATAA AAC^ACTAGA GAATGGTGGT CAAAAAAAAA AAAAAAAAAC 
TCGA 

(2) INFORMATION FOR SEQ ID NO: 225:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 760 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225:
GGGTCGACCC ACGCGTCCGC TGACCAGTCC GTTATAGATA CTTCTTCCTA TACCAAAACT 
GTTTAAACAG GTGCCACCAC AAGGGATGTC GTCCTTACTC TCTGCGGGTC . TTCAAGCA.TC 
CCTTTGTGGG AAAPjGTCTCT GOGCPiAGCAC GTGGTATTTG GTCTGCTGCT TGCTTCCCTT 
TTTCCACCAG GGATGTTGTG ATCATAAGTC AAAACAAC^vG TA.TATTCC-A ATCTCAAAA3 
CTATTGTGGC CTGAGCACAA TTCAAATCTA GCAGAGTTTT TCCTATGTAG CTTTAGAGTA 
ACTCTTCTGC TTCTCTGTCA CTTACAATTC AGGTTCTGCC TTTGCCTAAG ACCATGAGCk 
GAAGAGTCCT CATGTGACGC TTAGTTCTAT TGCAGTCCTG GGTGAAACTA TTTAA.GCWAT 
GGGGCTGCTK CTCCCCANWT CCTCCCTAAC AATTCGTTGT GTGGACTTCT CATCTAAAAG 
GTTAGTGGCT TTTGCTTGGG ATCAGTGCTC TCTATTGATG TTCTTGCTGG TCTCCAGACA 
CATTCCTGTT GCATTAAGAC TTGAAAGACT TGTAGATGTG TGATGTTCAG GCACAGGATG 
CTGAAAGCTA TGTTACTATT CTTAGTTTGT AAA.TTGTCCT TTTGATACCA TCATCTTGTT 
TTCTTTTTGT AGGTATAAA.T AAAAACACTG TTGACAATAA AAAAAAAAAA AAAAAAAAAA 
AAAAAAAAAA AAAAAAAAAA NAAAAAAAAA AA*J\AAAAAA 
(2) INFORMATION FOR SEQ ID NO: 226:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2057 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226:

CCGAGCCGGC TGCGCCGGGG GAATCCGTGC GGGCGCCTTC CGTCCCPGTC CCATCCTCGC . SO 

15 CGCGCTCCAG CACCTCTGAA GTTTTGCAGC GCCCAGAAAG GAGGC3AGGA AGGAGGGAGT 12Q 

GTGTGAGAGG AGGGAGCAAA AAGCTCACGC TAAAACATTT ATTTCAAGGA GAAAAGAAAA 130 
AGGGGGGGGG CAAAAATGGC TGGGGCAATT ATAGAAAACA TGAGCAjCGAA GAAGCTGTGC ■ 240 


ATTGTTGGTG GGATTCTGCT CGTGTTCCAA ATCATCGCCT TTCTGGTGGG AGGCTTGATT 300 

. GCTCCAGGGC CCACAACGGC AGTGTCCTAC ATGTCGGTGA AATGTGTGGA TGCCCGTAAG 360 

25 AACCATCACA AGACAAAATG GTTCGTGCCT- TGGGGACCCA ATCATTGTGA CAAGATCCQA 420 

GACATTGAAG AGGCAATTCC AAGGGAAATT GAAGGCAATG ACA7CGTGTT TTCTGTTCAC 420 

ATTCCCCTCC C C CACATGG A. GATGAGTCGT TGGTTCCAAT TCATGI-TTGTT TATCCTGCAG 540 


CTGGACATTG CCTTCAAGCT AAACAACCAA ATCAGRGAAA. ATGCAGAAGT CTCCATGGAC ' 600 

GTTTCGCTGG CTTACCGTGA TGACGCGTTT GGTGAGTGGA CTGAAATGGC CCATGAAAGA 66 Q 

35 GTACCACGGA AACTCAAATG CACCTTOACA TCTCCGAAGA CTCCAGAGCA TGGAGGGCGG 720 

GTTACTATGA ATGTGATGTC CTTCCTTTCA TGGAAATTGG GTCTGTGGGG CA.TGAAGTTT 730 

TACCTTTTAA ACATC CGGGT GCGTGTGAAT GAGAAGAAGA AAATCAATGT GGGAATTGGG 340 

40 

GAGATAAAGG ATATCCGGTT GGTGGGGATC GACCAAAATG GAGGCTTCAC GAAGGTGTGG 900 

TTTGCCATGA AGACCTTCCi! TACGGCCAGG ATCTTCATCA TTATGGTGTG GTATTGGAGG -960 

45 AGGATCACCA TGATGTGGCG ACGCCGAGTG CTTCTGGAAA AAGTCATCTT TGCCCTTGGG 1020 

ATTTCCATGA CGTTTATCAA TATCCCAGTG GAATGGTTTT CCATCGGGTT TGACTGGACC 108 0 

TGGATGCTGC TGTTTGGTGA CATCCGACAG GCATCTTCTA TGCPATGCTT CTKTCCTTCT 1140 

50 

GGATCATCTT CTGTGGCGAG CACATGATGG ATCAGCACGA GCGGAACCAC ATCGCAGGGT 1200 

• ATTGGAAGCA AGTCGGACCC ATTGCCGTTG GTCCTTCTGC CTCTT CAT AT TTGACATGTG 1260 

55 TGAGAGAGGG GTACAACTCA CGAATCCCTT CTACAGTA.TC TGGACTACAG ACATTGGGAA 1320 

CAGAGCTGGC CATGGCTTTC ATCATCGTGG CTGGAATCTG CCTCTGCCTC TAACTTCCTG 1330 

TTTCTATGCT TCATGGTATT TCAGGTGTTT CGGAACATCA GTGGG AA.GCA GTCCAGCCTG 1440 
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co.2-~.ra-. ggaaagtcgg g7g-gttaga:: ta-Ggaggggc taa:ittttac- gttoagttc 
crcATC-crrA TCAcrrroc-r rrsrscrsc:: .-.ri-^r ca tcttcttcat cgttagtcag 

GT.---.CZ-3.-A1- GCCATTGGGA A-GGGGC-CGG CGTCATA^rC CCAAGTGAA.C AGTC-CCTTTT 

TC-.CAc-G-c-.r — -.rc-GGArr -ggaatgtgt rr^GATcrrc ttgtatgcac 

CArCCCA-TAA PAA-CTATC-GA GAAGACCAGT CCAA:GGA^T C-CAACTCCCA TGTAAATCGA 

C^-A^GATTG rGCTTGGTTr G7GTCGGAAC TGGAGGAA.GA A.GTGTTCAGC GCTTCGAAA.T 

ATTGCCTCAr C-A-GaACAAC G7AGCT7CTG GTArGTGAGT CAACAAGGCA ACA.CATGTTT 

ATC-JGGTTTG' CAZTTGCAGT 777CACAGTC ACATTGAGGG rA.CTTGTA.TA CGCA.CACAAA 

TACACrCATT GAGCC— TAT rrCA-AACGT GAAAGA-GAAG CI-AAAAAjGCG TCAA.CAA.TAA 

ATAiTCTTTG \aGTACTGTGT 7ACCTCTC7T .-AAAAAAAAA AAAAAAACTC GTGCCGAATT 
CGGCACGAGC GGGACGA 

(2) INFORMATION FOR SEQ ID NO: 227:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2084 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227:

GC-tA;G.--GC-G2 CATTTC CTGC A-AjGAGCCAA A.CCZCCArTC CTCTCTC-CCC CTCCTCTCCC 
ACCAAGTGC7 TZAGA-AAA-T AGCTCTTGTT ACGGG.-AA.rA ACTGTTCATT TTTCACTCGT 
CCCrGGTAGG TCACA l ' l ' l CAGA-AAAGA, A-TCGC-CAGCC. TC-GPAACCAjG AAGAAAAATA 
TGAGAC GC-GG AAGCArCGTG VGAGGTG7GT SCZZZ-ZZZZZ C-GTGAGTGTG TGGAGTCCTG 
CTCAC-GTGTT AGGTACAGTG 7GTTTGA7C G TC-GrC-C-CGTG AGGGCAACCG CTTGTTGAGA 
GCTGTGACTG C^GCZC-GACT GCAGAGAAGC TGCCCTTC-GC TGGTCGTAGC GCCGGGCCTT 
CTCTCGIGGT. CATCArCGAG .-G-CAGCCAGT GTG CGGGAGG O.GAAGGTAC CGGGGCAGCT 
ACTGGAGGAC CGTGCGGGCC 7GCCTGGGCT GCCCCCTCCG CCGTGGGGCC CTGTTGCTGC 
TGrCCATCTA -TTTCTACTA.C 7GGGGCGG.-A ATGGGGCCGG CCCGCOCTTC AC TTGG ATGC 
TTGCCCTCCr GGGCCTTCTTC GGAGGCAGTG .-AJGAGCCTCC TGGGCCTOA. GGGCCTGGCC 
OCAGOTGAGA TGTCTGCAGT GGGTGAAAAA. GC<AATCTCA ACGTGGCCCA TGGGCTGGCA 
TGGTCAjXATT AGATCGGAGA rCTGCGC-CTG ATCGTGCCAG AGCTCCAGGC CCGGATTCGA 
ACTTACA-.TC AGGATTACAA. CAACCTGCTA CGGGGTGCAG TGAGCCAGCG GTGTNATATT 
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CTCCTCCCAT TGGACTGTGQ GGTGCCTGAT AAC CTGAGT A. TGGCTGACCC CAACATTCGC 340 

TTCCTGGATA AACTGCCCCA GCAGACCG3T GACC3TGCTG CCA-TCAAGGA. TGGGGTTTAC 900 

5. 

AGCAACAC-GA. T CT ATGA.GCT TCTGGAGAA.G GGGCAC-CGGG CGGGCACCTG TGTGCTGGAG 560 



TACGCCACCC CCTTGCAGAC TTTGTTTGCC ATGTCACAAT ACAGTCAAGC TGGGTTTAGG 1020 
10 GGGGAGGATA. GGCTTGAGCA GGCCAAACTC TTCTGCCGGA CACTTGAGGA. CATCCTGGCA 1080 



GATGCCCCTG AGTCTCAGAA CAACTGCCGC CTCATTGCCT ACCAGGAA-CC TGCAGA.TGAC 1140 

AGGAGCTTCT CGCTGTCCCA GGAGGTTCTC CGGCACCTGC GGCAGGAGGA AAAGGAAGAG 1200 

15 

GTT ACTGTGG GCAGCTTGAA GA.CCTCAGCG GTGCCCAGTA CCTCCACGAT GTGGGAAGAG 1250 

CCTGAGCTCC TCATCAGTGG ' AATGGAAAAG CCCCTCCCTC TCCGCACGGA TTTCTCTTGA 1320 

20 GACCCA.GGGT CACCAGGCCA GAGCCTCCAG TGGTCTCGAA GCCTCTGGAC TGGGGGCTCT 1330 

CTTCAGTGGC TGAATGTCCA, GCAGAGCTA.T TTCCTTCCAC AGGGGGCCTT GGAGGGAA.GG 1440 

GTCCAGGA.CT TGACATCTTA AGATGCGTCT TGTCCCCTTG GGCCAGTCAT TTCCCCTCTC IS 00 

25 

TGAGCCTCGG TGTCTTCAAC GTGTGAAATG 4SGATCATAAT CACTGCCTTA CCTCCCTCAC 1550 



GGTTGTTGTG AGGACTGAGT GTGTGGAAGT TTTTCATAAA CTTTGGATGC TAGTGTACTT 1520 
30 AGGGGGTGTG CCAGGTGTCT TTGATGGGGG CTTCCAGACC CACTCCCCAC CCTTCTCCCC 1630 
TTCCTTTGCC GGGGGACGCC GAACTCTGTC - AATGGTA.TCA ACAGGCTCCT TCGCCCTCTG 1740 



GGTCCTGGTC ATGTTCCATT ATTGGGGAGG CCCAGCAGAA GAATGGAGAG GAGGAGGAGG 1300 
CTGAGTTTGG GGTA.TTGAA.T CCCCCGGCTC CCACCCTGCA GCATCAAGGT TGCTATGGAC 1360 



TCTCCTGCCG GGCAACTCTT GCGTAATCAT GACTATCTGT AGGATTCTGG CACGACTTCC 1920 
40 TTCCCTGGCC CCTTAAGCCT AGCTGTGTAT CGGCACCCCC ACCCCACTAG AGTACTCCCT 1930 



CTCACTTGCG GTTTCCTTAT ACTCCACCCC TTTCTCAACG GTCCTTTTTT AAAGCACATG 2040 
TCAGAJITAAA. AAAAAAAAAA AAAAAAAAAA ACCCGGGCC^ GCNT 2034 

(2) INFORMATION FOR SEQ ID NO: 228:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2143 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228:

TCGA.CC CA.CG CGTCCGGTTG AATTCCTTGA CCTGCAAACA CATA.TTTA.TT AGCCTGACTC 50 

60 
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AAAOATGAA gctaxtaaaa cttcggagga ac^ttgtaaa actctctttg TATCGGCATT 120 

TGACGAACAG GCTTATTTTG GCAGTGGGAG CATCO.TTGT GTTTATCATC TGGACAACGA 130 

TGAA.GTTCAG AATAGTGA.CA. TGTCAGTCGG ACTGGCGGGA GCTGTGGGTA GAGGA.TGGCA 240 

TCTGGCGCTT GCTGTTCTCC ATGATCCTCT TTGTCATCAT * GGTTCTCTGG CCACCATCTG 300 

GAAACAAGGA. GAGGTTTGCC TTTTCACCAT TGTGTGAGGA AGAGGAGGAG GATGAACAAA 360 

AGGAGCCTAT GGTGAAAGAA AGGTTTGAAG GAATGAAAAT GAGAAGTACC AAACAAGAAC 420 
GCAATGGAAA. TAGTAAAGTT AACAAAGGAG AGGAAGAGGA TTTGAAGTGG GTAGAAGAGA ' 430 
15 ATGTTCCTTC TTCTGTGAGA GATGTAjGCAC TTCCAGCGCT TCTGGATTCA GATGAGGAAC ' 540 

GAATGATCAC ACACTTTGAA AGGTGGAAAA TGGAGTAAGG AATGGGAAGA TTTGCAGTTA 500 

AAGATGGCTA CCATCAGGGA AGAGATCAGC ATGTGTGTCA GTCTTGTGTA CGGGTGCATG 660 

20 * 

GGATTAAA.GG AAGGAA.TGAG A.TGCTGATCT GTTCCTTGAT CTTTGGGGA.T TGGAGTTGGC 720 

GAGAGGTGTG AGAACAAAGA. GAAGATGTTA. CTGAAAACAA GTTGATAAGA- TGAGAAAAA.T 730 

25 CTAGGAGGTT CTTATTTAGA ACACTGCTGC CCCCTTTCCT G CG AG ACT CT GACA.TGGATG 840 

TTCATGCAAC TTAAGTGTGT .TGTTCCTGAA CTTTCTGTAA TGTTTCATTT TTTAAATCTG 900 

ACAAACTAAA- AAGTTTAACG TCTTCTAAAA GATTGTCATC AACACCATAA TATGTAATCT 960 

30 

CGAGGAGG AA CTGCCTGTAA TTTTTATTTA TTTAGGGAGT TACATAGGTG ATGGGGGAAA 1020 

TTGTTAACTA CGTTTCATTT TCCTGGGAAG TCAAGGTTAC ATCTTGCAGA GGTTGTTTTG 1030 

35 AGAAAAAAGG GCCCTTCTGA GTTAAGGAGG CATAGTTCTA. TCAATGATCA AAAGAAAAAA 1140 

AAAAAAAAGA GAAACTGTTA CAGTATGATT CAGATGATTT AAAAAAGGAA AATCAAGTGC 12Q0 

AATTTTGTTT AGAAATGGTG TATATTAAAG ATTTTTCTAT TTCAGA.TGTA CTTTAAAGAG 1250 

40 

AAA.TATTAGC TTAACTCTTT TGACATCTGG TATTGTGACA CATCGCATTG CTGGCAATGT 1320 

GGTGGAGACT CGGAAA.CTTT TAA.CTACTGT TTTGTAAGCC TCCAAGGGTG GGATTGCAGG 1380 

45 GTCCTTAGGG AATGTTTTGT TTGGCTTTAT GCAGAGAGGT GCTCCAAGTG CTGTGATTGA 1440 

GCACCGTGCT AGAGGAAGTG TAATGCTTCA GAAGTTGTAG CTTATACAAA GGAAACAGGT 1500 

CCTGCTGGCT TAATTTAAAC AGTTATTGCA. TGAAGTAGGG TGGAGGGCCT GGACTGGTGG 1560 

50 

TCGTTCTTTA GGATGGACTG TTGTGGTA.TG TGGTATTGGT TTAGAGACTG TTAATAAGGG 1620 

ACATCACAAG GTGATGGGAT TCATTTGAAG CACTCTATTT CTGTTTTAAT GGTTTTATCG 1530 

55 AATTTTGGCT TCGCAAGATT TTTGTTCTAC ATAAAAAGTT CATGGGACTT TTTAATATAA 1740 

AAAAATTTAA CAAAATTAAT GTATTTTTCT CATTTTTTTC AAACTTTTTC TAAAGACTCT 1300 

TTCTGTCAAA CTCATC-AAAA ATTTCTTTCT ATGGGTTTTA TTCTAGATTG TCTTATTTTC 1360 



60 
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TGTTAAAA.CC AATGACCACA TGACCACAAT CTTCACTAAC TCATACTGCA. GTGAAAGTGT 1920 

TAA.CCCTTAG GTAGTTTCTC TACA ACTCTT TGCTATGGTG ATTTTTA AAA AAGTTTCCTA. 1930 

5 GGG AAGTATC TCTGAGGGAA C-GGCAATCT GAAjCGAACTG ACTATA.TTCT CCA.TGGCTAA. 2040 

GTCCATTAGG CCAAAAQMCT GGGTGGGTAT TGGTTGTCAM GCTGTCTATT GGC A.TA.TTAA 2100 

AAACGTAGGC CGGANGGAAT AATTAGGTTG TNATGCCGGC GGG 2143 

(2) INFORMATION FOR SEQ ID NO: 229:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1025 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229:

CCTGGCCCAC ATTGCTTCAT TGGCCTGGCC ATGCGCCTGT ACTATGGCAG CCGCTAGTCC 60
CTGACAACTT CCACCCTGAT TCCGGACCCT dTAGATTGGG CGCCACCACC AGATCCCCCT 120
CCCAGGCCTT CCTCCCTCTC CCATCAGCAG CCCTGTAACA AGTGCCTTGT GAGAAAAGCT 180
GGAGAAGTGA GGGCAGCCAG GTTATTCTCT GGAGGTTGGT C-GATGAAGGG GTACCCTAGG 240
AGATGTGAAG TGTGGGTTTG GTTAAGGAAA TGCTTACCAT CCCCCACCCC CAACCAAGTT 300
CTTCCAGACT AAAGAATTAA GGTAACATCA ATACCTAGGC CTGAGAAATA ACCCCATCCT 360
TGTTGGGCAG CTCCCTGCTT TGTCCTGCAT GAACAGAGTT GATGAAAGTG GGGTGTGGGC 420
AACAAGTGGC TTTCCTTGCC TACTTTAGTC ACCCAGCAGA GCCACTGGAG CTGGCTAGTC 480
CAGCCCAGCO ATGGTGCATG ACTCTTCCAT AAGGGATCCT CACCCTTCCA CTTTCATGCA 540
AGAAGGCCCA GTTGCCACAG ATTATACAAC CATTACCCAA ACCAGTCTGA CAGTCTCCTC 600
CAGTTCCAGC AATGCCTAGA GACATGCTCC CTGCCCTCTC GACAGTGCTG CTCCCCACAC 660
CTAGCCTTTG TTCTGGAAAC CCCAGAGAGG GCTGGGCTTG ACTCATCTCA GGGAATGTAG 720
CCCCTGGGCC CTGGCTTAAG CCGACACTCC TGACCTCTCT GTTGACCCTG AGGGCTGTCT 780
TGAAGCCCGC TACCCACTCT GAGGCTCCTA GGAGGTACCA TGCTTCCCAC TCTGGGGCCT 840
GCCCCTGCCT AGCAGTCTCC CAGCTCCCAA CAGCCTGGGG AAGCTCTGCA CAGAGTGACC 900
TGAGACCAGG TACAGGAAAC CTGTAGCTCA ATCAGTGTCT CTTTAACTGC ATAAGCAATA 960
AGATCTTAAT AAAGTCTTCT AGGCTGTAGG GTGGTTCCTA CAACCAGAGC CAAAAAAAAA 1020
AAAAA 1025
(2) INFORMATION FOR SEQ ID NO: 230:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1250 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230:

GCCCACGCGT CCGCCCACC-C GTCCGGCGGT GCGGAGTATG GGGCGCTGAT GGCCATGGAG 60
GGCTACTGGC GCTTCCTGGC GCYGCTGGGG TCGGCACTGC TCGTCGGCTT CCTGTCGGTG 120
ATSTTCGCCC TCGTCTGGGT CCTCCACTAC CGAGAGGGGC TTGGCTGGGA TGGGAGCGCA 180
CTAjGAGTTTA ACTGGCACCC AGTGCTSATG GTCACCGGCT TCGTCTTCAT CCAGGGCATC 240
GCATCATCGT CTACAGACTG CCGTGGACCT GGAAATGCAG CAAGCTCCTG ATGAAATCCA 300
TCCATGCAGG GTTAAATGCA GTTGCTGCCA TTCTTGCAAT TATCTCTGTG GTGGCCGTGT 360
TTGAGAACCA CAATGTTAAC AATATAGCCA ATATGTACAG TCTGCACAGC TGGGTTGGAC 420
TGATAGCTGT CATATGCTAT TTOTTACAGC TTCTTTCAGG TTTTTCAGTC TTTCTGCTTC 480
CATGGGCTCC GCTTTCTCTC CGAGCATTTC TCATGCCCAT ACATGTTTAT TCTGGAATTG 540
TCATCTTTGG AACAGTGATT GCAACAGCAC TTATGGGATT GACAGAGAAA CTGATTTTTT 600
CCCTGAGAGA TCCTGCATAC AGTACATTCC CGCCAGAAGG TGTTTTCGTA AATACGCTTG 660
GCCTTCTGAT CCTGGTGTTC GGGGCCCTCA TTTTTTGGAT AGTCACCAGA CCGCAATGGA 720
AACGTCCTAA GGAGCCAAAT TCTACCATTC TTCATCCAAA TGGAGGCACT GAACAGGGAG 780
CAAGAGGTTC CATGCCAGCC TACTCTGGCA ACAACATGGA CAAATCAGAT TCAGAGTTAA 840
ACARTGAAGT AGCAGCAAGG AAAAGAAACT TAGCTCTGGA TGAGGCTGGG CAGAGATCTA 900
CCATGTAAAA TGTTGTAGAG ATAGAGCCAT ATAACGTCAC GTTTCAAAAC TAGCTCTACA 960
GTTTTGCTTC TCCTATTAGC CATATGATAA TTGGGCTATG TAGTATCAAT ATTTACTTTA 1020
ATCACAAAGG ATGGTTTCTT GAAATAATTT GTATTGATTG AGGCCTATGA ACTGACCTGA 1080
ATTGGAAAGG ATGTGATTAA TATAAATAAT AGCAGATATA AATTGTGGTT ATGTTACCTT 1140
TATCTTGTTG AGGACCACAA CATTAGCACG GTGCCTTGTG CAKAATAGAT ACTCAATATG 1200
TGAATATGTG TCTACTAGTA GTTAATTGGA TAAACTGGCA GCATCCCTGA 1250

(2) INFORMATION FOR SEQ ID NO: 231: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1811 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231:
CMGNCAGTAC CGGTCNGATT CCCGGGTCGA CCCACGCGTC CGCTGCATTC OAGGGCCTTT . oO 

10 C AGT GGCTTT CATTCTGAAG TTCCTGGATA ACATGTTCCA TGTCTTGATG GCCCAGGTTA 120 

CCASTGTCAT TA_TC.AC.-ACA GTGTCTGTCC TGGTCTTTGA CTTCAGGCCC TCCCTGGAAT 130 

TTTTCTTGGA AGCC3CATCA GTC STYCTCT CTATATTTAT TTATAATGCC AGCAAGCCTC 240 

15 

AAGTTCCGGA ATACGCACCT AGGCAAGAAA GGATCCGAGA TCTAAGTGGC AATCTTTGGG 300 

AGCGTTCCAG TGGGGATGGA GAAGAACTAG AAAGACTTAC C-AACCCA\G AGTGATGAGT 360 

20 CAJGATGAAGA TACTTTCTAA CTGGTACCCA CATAGTTTGC AGCTCTCTTG AAOCTTATTT 420 

TCACATTTTC AGT G T TT GTA ATATTTATCT TTTCACTTTG ATAAACCAGA AATGTTTCTA 430 

AATCCTAATA TTCTTTGCAT ATATCTAGCT ACTCCCTAAA. TGGTTCCATC CA^GCTT^G ' 540 

AGTACCCAAA GGCTAAGAAA TTCTAAAGAA [CTGATACAGG AGTAACAATA TGAAGAATTC 500 

ATTAATATCT CAGTA.CTTGA TAAATCAGAA AGTTATATGT GCAGATTA.TT TTCCTTGGCC 6 SO 

30 TTCAAGCTTC CAAAAAACTT GTAATAATCA TGTTAGCTAT AGCTTGTATA TACACATAGA 720 

"GATCAATTTG C CAAAT ATTC ACAATCATGT AGTTCIAGTT TACATGCCAA AGTCTTCCCT 730 

TTTTAACATT ATAAAAGCTA GGTTGTCTCT TGAATTTTGA GGCCCTAGAG ATAGTCATTT 340 

35 

TGCAAGTAAA GAGCAACGGG ACCCTTTCTA AAAACGTTGG TTGAAGGACC TAAATACCTG 900 

• GCCATACCAT- AGATTTGGGA. TGATGTAGTC TGTGCTAAAT ATTTTGCTGA AGAAGCAGTT 960 

40 TCTCAGACAC AACA.TCTCAG AATTTTAATT TTTAGAAATT CATGGGAAAT TGGATTTTTG 1020 

TAATAATCTT TTGATGTTTT AAACATTGGT TCCCTAGTCA CCATAGTTAC CACTTGTATT ■ 1080 

TTAAGTCATT TAAACAAGCC ACGGTGGGGC TTTTTTCTCC T C AGTTTGAG GAGAAAAATC 1140 

45 

TTGATGTCAT TACTCCTGAA TTATTACATT TTGGAGAATA AGAGGGCATT TTATTTTATT 1200 

AGTTACTAAT TCAAGCTGTG ACTATTGTAT ATCTTTCCAA GAGTTGAAAT GCTGGCTTCA ' 1260 

50 GAATCATACC AGATTGTCAG TGAAGCTGAT GCCTAGGAAC TTTTAAAGGG ATCCTTTCAA 13 20 

AAGGATCACT TAGCAAACAC ATGTTGACTT TTAACTGATG TATGAATATT AATACTCTAA 1330 

AAATAGAAAG ACCAGTAATA TATAAGTCAC TTTACAGTGC TACTTCACAC TTAAAAGTGC 144Q 
55 - 

ATGGTATTTT TCATGGTATT TTGCATGCAG CCAGTT.AACT CTC-GTAGATA GAGAAGTCAG 15 00 

GTGATAGATG ATATTAAAAA TTAGC-AACA AAAGTGACTT GCTCAGGGTC ATGCAGCTGG 1560 

60 GTGATGATAG AAGAGTGGGC TTTAACTGGC AGGCCTGTAT GTTTACAGAC TAOCATACTG 1 620 






TAA a .TA.TGAG CTTTATGGTG TCATTCTCAG AAACTTATAC ATTTCTGCTC TCCTTTCTCC 
TAAGTTTCAT GCAGATQ.-AT ATAAGCTAAT ATACTATTAT ATAATTCATT TGTGATATCC 
AOVATAATAT GACTGGOAG AATTGGTGGA A-.TTTGTAAT TAAAATAATT ATTAAACCTA 
AAAAAAAAAN N 

(2) INFORMATION FOR SEQ ID NO: 232:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2271 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232:
CTGACCTCAT GGCGTAGAGC CTA3CAACAG CGCAGGCTCC CAGCCGAGTC CGTTATGGCC 
GCTGCCGTCC CGSAGAGGAT GAGGGGGCCA GCACAAGCGA A^CTGCTGCC CGGGTCGC-CC 
ATCCAAGCCC TTGTGGGGTT GGCGCGGCCG CTGGTCTTGG CGCTCCTGCT TGTGTCCGCC 
GCTCTATCCA GTGTTGTATC ACGGACTGAT TCACCGAGCC C-ACCGTACT CAACTCACAT 
ATTTCTACCC CAAATGTGAA TGCTTTAACA CATGAAAACC AAACCAAACC TTCTATTTCC 
CAAATCAGCA CCACCCTCCC TCCCACGACG AGTACCAAGA AAAGTGGAGG AGCATCTGTG 
GTCCCTCATC CCTCGCCTAC TCCTCTGTCT CAAGAGSAAG CTGATAACAA. TGAAGATCCT 
AGTATAGAGG AGGAGGATCT TCTOATGCTG AACAGTTCTC CATCCACAGC CAAAGACACT 
CTAGACAATG GCGATTATGG AGAACCAGAC TATGACTGGA CCACGGGCCC CAGGGACGAC 
GACGAGTCTG ATNGACACCT TGGAAGAAAA CAGGGGTTAC ATGGAAATTG A-.CAGTCAGT 
GAAATCTTTT AAGATGCCAT CCTCAAATAT AGAAGAGGAA GACAGCCATT TCTTTTTTCA 
- TCTTATTATT TTTGCTTTTT GCATTGCTGT TGTTTACATT ACATATCACA ACAAAAGGAA 
GATTTTTCTT CTGGTTCAAA GCAGGAAATG- GCGTGATGGC CTTTGTTCCA A-ACAGTGGA 
ATAC CATCGC CTAGATCAGA ATGTTAATGA GGCAATGCCT TCTTTGAA.GA TTACCAATGA 
TTATATTTTT TAAAGCACTG TGATTTGAAT TTGCTTATGT AA.TTTTATTT GCTTGACTTT 
TTATATGATA TTGTGCAAAT GTTTGCCATA GGCAATTGGT ACTTAAATGA GAGGTGAGTC 
TCTCTTTTGC CTTGGTGCTT TGGAAATTAA ATGTCACAAA CGAGTATATA ATTTTTTATC 
TGTACTTTTA GAGCTGAGTT TAA.TCAGGTG TCCAAAATGT GAGTTAAACA TTACCTTATA 
TTTACACTGT TAGTTTT7AT TGTTTTAGAT TTATTATGCT TCTTCTGGAA GT ATTAGTGA 
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TGCTACTTTT AAAAGATCCC AAACTTGTAA CTAAATTCTG ACATATC7GT TACTGCTGAC 1200 

TCACA.TTCAT TCTCCGCCAT TCAAATACTA TTTTTTATCC ACATTTTTTT TTGTTCCCAA 1260 

5 ACTGTAATGT ACAA3GATAT GTGTGATAAT GCTTTGGATT TGAG7AATAT TTTTTTTTCT 1320 

TCCAAjGAAAA CTGCTTTGGA TATTTTTAGA TAATTTAAAC ATAATTTAGG ATAATGATAT 1330 

TGCTCAATCT GACCACAATT TTAGGTAAAA CATTAAATGT GTCAAGAP-AT CTTGGCAACA 1440 

10 

GAGACTCTGC AGCTTGCAGT GGACATAGAT AAAATGTTAC AGAGATACTA TTTTTTTGGT 1500 

TGGAATTACT ATATTAAATT TAGAAGCAGA AACTGGTAAA ATGTTAAATA CATGTACAAT 1560 

15 TGCTTTTAGT TAGCAATTGA TTGTAGCATG GGTTCCTCCA AGGTTTCAAG CAATGGGCAG 1520 

AGTTTAAAAT TATATCAGAT TCGTTTACTT CGTTTATTAT TTTACAGTAA ATTTGAATAA 1530 
ATCTTAGGGG TCATTATCAC TTAAATAATA CTGTACCTAG GTCTTTCAAA TTJVAAATTAT ' 1740 

20 

ACCTGAATGA AGTTGTTTGT ATACATAAAG GATATTTGTG TACAATTACC TTTTTTCCCC 1300 
CACACTTGTT TTCTTTGTTT TTGTTTTTTA TGGCAACTGG AAAGTATTTA CTATGGGATT - -1360 

25 CATTTATGTC TGTCTTTCTA TCATAAAGAA TTGATCAATA TGTAAATATG TGATTTGAAC 1320 

CAT GGTTG AC TTACAAGTGT CACTAOGCT TTTTAGAAAA CATAGCCCTA ATATATGTTA 1930 

AGCAGGACCC GGGTGAGCCA GTGGGCTTGC GCTTTATGTA GAGCTGGAAG AAGGCCGTCC 2040 

30 

ATCCTGTCTC TTGGGCGGAC AGTGTACTTT CCTAATAGGG AAGGGAAGCA CAATGGAAAT 2100 

ACCCCTGAAC CGTTTTATTG CAGTAATTTT TTTCATATCT GAAACTATTA TTTAATATTT 2 ISO 

35 TGAATAAGAT TTTAAAAAAT AAATGGCAAA GATATAAATC TAAAAAAAAA AAAAAAAAAA 2220 

AAAi^AAAAAA AAAAAAAAAA AAA=AAAAAA AAAAAAAAAA AAAAAAMAMA N 2271 



(2) INFORMATION FOR SEQ ID NO: 233:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1338 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233:

CTTCCGGTTC TCCGGGCAGC TGCCACTGCT GTAGCTTCTG CCACCTGCCA OGACCGGGCC 50 
TCTCCCTGGC GTTTGGTCAC CTCTGCTTCA TTCTCCACCG CGCCTATGGT CCCTCTTGGA • 12 Q 

55 

GCCAGCGTGG CGNGCCTGGC GGCTCCCGGG TGGTGAGAGA GCGGTCCGGG A^.CGATGAAG 130 

GCGTCGCAGT GCTGCTGCTG TCTCAGCCAC CTCTTGGCTT CCGTCCTCCT CCTGCTGTTG 240- 

60 CTGCCTGAAC TAAGCGGGYC CCTGGMAGTC CTGCTGCAGG CAGC CC- AC-C-C CGCGCCACGT 300 
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YTTGGGCCTC CTGACCCTAG ACC?.GGACAT TACCC-CCGCT GCCACCGGGC CCTWACCCCT 260 

GCCCAGCAC-C CGGGCCGTGG TCTGGCTGAA GCTGCGGGGG CCGCGGGGCT OCGAGGGAGG 420 

5 * 

OATGGCAGC AACCCTGTGG CCGGGCTTGA GACGGACGAT CACGGAGGGA AGGCCGGGGA 430 

AFJGCTCGGTG GGTGGCGGCC TTGCTGTGAG CCCCAACCCT GGCGACAAGC CCATGACCCA 540 

10 GCGGGCCCTG ACCGTGTTGA TGGTGGTGAG CGGCGCGGTG CTGGTGTACT TCGTGGTCAG 600 

GACGGTCAGG ATGAGAAGAA. GAAACCGAAA GACTAGGAGA TATGGAGTTT TGGACACTAA 560 

CATAGAAAAT ATGGAATTGA OCCTTTAGA ACAGGATGAT GAGGATGATG ACAACACGTT 720 

15 

GTTTGATGCC AATCATCCTC GAAGATAAGA ATGTGCCTTT TGATGAAAGA ACTTTATCTT 7S0 

TCTACAATGA AGAGTGGAAT TTCTATGTTT A-JGGAATAAG APvGCCACTAT ATCAATGTTG 340 

20 GGGGGGTATT TAAGTTACAT ATATTTNAAC AA.CCTTTAAT TTGCTGTTGC AATAAATACC 900 

GTATCCTTTT ATTATATCTT TATATGTATA GAAGTACTCT GTTAATGGCC TCAGAGATGT 960 

TGGGGATAAA GTATACTGTA ATAATTTATC TGTTTGAAAA TTACTATAAA ACGGTGTTTT 1020 

25 

CTGRTCGGTT TTTGTTTCCT GCTTACCATA TGATTGTAAA TTGTTTTATG TATTAATC.-G 1080 

TTAATGCTAA TTATTTTTGC TGATGTCATA TGTTAAAGAG CTATAAATTC CAACAACC>A 1140 . 

30 CTGGTGTGTA AAAATAATTT AAAATYTCCT TTACTGAAAG GTATTTCCCA TTTTTGTGGG 1200 

GAAAAGAAGC CAAATTTA.TT ACTTTGTGTT GGGGTTTTTA AAATATTAAG AAATGTCTAA 1260 

GTTATTGTTT GCAAAAC-A.T AAATATGATT TTAAATTCTC TTAAAAAAAA AAAAAArAAC 13 20 

35 

CCCGGGGGGG GGCCCGGN 1335 



(2) INFORMATION FOR SEQ ID NO: 234:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234:

Met Leu Ser Thr Gly Ile Glu Val Ala Arg Pro Pro Ala Thr Leu Leu
1 5 10 15

Gly Leu Met Phe Val Leu Thr Gly Met Pro Arg Gly Leu Arg Xaa
20 25 30

(2) INFORMATION FOR SEQ ID NO: 235:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 115 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235:

5 M 2 r */aI Val lie Vai lis lie Leu ?he Ser Phe Asp Ser Val Gly 

1 S 10 13 

Thr Mec Phe Ser Cys Asn Arg' lie Pro Lys He Thr Val Leu Asn Lys 
20 25 30 

10 

Leu Lys Phe Xaa Cys Glu Val Leu Leu Arg He Gin Thr He Gin Gly 
35 40 45 

Phe Tyr Arg Cys Thr Arg lie Ser Arg Tyr Lys Gly He Phe Pro Asp 
15 50 55 60 

Phe Cys Gin Ser Gin Cys dec Gly Cys Asn Fro Glu Ser Xaa Mec Ala 
65 70 75 80 

20 Vai Pro Ala Leu Val Thr Pro He Leu Ala His Arg Lys Lys Glu Lys 

85 90 55 

Gly Mec Cys Leu Phe Thr Leu He He Ala Pro Thr Arg Cys Thr His 
100 105 HQ 

25 - 

Tyr Phe Cys Xaa 
115 



30 



(2) INFORMATION FOR SEQ ID NO: 236:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 103 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236:

Met Ser Ser Ala Lys Ile Val Arg Gln Arg Gly Ala Val Pro Thr Tyr
1 5 10 15

Tyr Thr Thr Glu Ala Gly Glu Ile Ile Phe Leu Val Leu Asn Trp Ser
20 25 30

Leu Ser Ile Leu His Ile Val Asp Val Leu Cys Ser Lys Pro Glu Lys
35 40 45

Ser Val Thr Glu Asp Ala Ala Ser Gly Leu Ser Gln Arg Met Thr Ala
50 55 60

Leu Val Trp Arg Lys Gly Pro Asp Gly Gly Ser Arg Lys Pro Ile Leu
65 70 75 80

Leu Leu Phe Phe Phe Leu Pro Leu His Leu Cys Phe His Ser Phe His
85 90 95

His Ser Ser Asn Ile Cys Xaa
100
(2) INFORMATION FOR SEQ ID NO: 237:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237:

Met Ile Leu Phe Pro Gln Xaa Ala Leu Arg Leu Gly Xaa Trp Pro Arg
1 5 10 15

Thr Trp Ser Ile Leu Xaa Lys Tyr Ser Val Asn Phe Phe Ser Ala Tyr
20 25 30

Ser Pro Met Gly Ala Val Gly Thr Glu Phe
35 40



(2) INFORMATION FOR SEQ ID NO: 238:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238:

Met Ile Ile Leu Leu Leu Phe Met Leu Leu Asn Asn Val Val Leu Val
1 5 10 15

Gln Glu Asp Asn Cys Gln Arg Lys Asn Thr Val Gln Glu Arg Arg Xaa
20 25 30

Trp Ser Gln Trp Xaa
35 



(2) INFORMATION FOR SEQ ID NO: 239:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 128 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239:

Met Ala Ala Xaa Pro Pro Gly Cys Thr Pro Pro Xaa Leu Leu Asp Ile
1 5 10 15

Ser Trp Leu Thr Glu Ser Leu Gly Ala Gly Gln Pro Val Pro Val Glu
20 25 30

Cys Arg His Arg Leu Glu Val Ala Gly Pro Arg Lys Gly Pro Leu Ser
35 40 45

Pro Ala Trp Met Pro Ala Tyr Ala Cys Gln Arg Pro Thr Pro Leu Thr
50 55 60

His His Asn Thr Gly Leu Ser Glu Leu Leu Glu His Gly Val Cys Glu
65 70 75 80

Glu Val Glu Arg Val Arg Arg Ser Glu Arg Tyr Gln Thr Met Lys Val
85 90 95

Arg Arg Ala Gly Leu Gly Pro Thr Pro Gly Met Ser Cys Pro Gly Asn
100 105 110

Asp Asn Thr Val His Thr Met His Gly Glu Ala Asn Arg Gly Ser Xaa
115 120 125



(2) INFORMATION FOR SEQ ID NO: 240:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 67 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240:

Met Ser Ile Leu Cys Cys Pro Xaa Leu Cys Leu Phe Phe Ser Phe Cys
1 5 10 15

Ile Ser Ser Gly Ser Cys Pro Phe Ser His Val Ser Gln Leu Ser Phe
20 25 30

Ile Ala Thr Phe Ser Gln Ser Ser Pro Val Leu Leu Val Pro Ala Tyr
35 40 45

Asn Thr Tyr Leu Ser Phe Leu Ala Phe Leu Asp Cys Ala Ser Leu Thr
50 55 60

Ser Thr Xaa 
65 



(2) INFORMATION FOR SEQ ID NO: 241:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 69 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:

Met Ser Thr Phe Gln Leu Leu Leu Leu Ile Leu Ala Gln Ser Thr Tyr
1 5 10 15

Lys Ile Lys Ser Lys Pro Leu His Met Thr Asn His Thr Leu Leu Asn
20 25 30

Ser Pro Gly Leu Asn Pro Ser Ser Pro Thr Leu Asn Phe Lys Thr Gln
35 40 45

Gln His Glu Ser Val Ser Tyr Ala Cys Cys His Met Arg Ser Leu His
50 55 60

His Ala Phe Ala Xaa
65 



(2) INFORMATION FOR SEQ ID NO: 242: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242:

Met Val Ser Val Val Leu Ile Phe Ser Phe Leu Ser Leu Thr Ile Ser
1 5 10 15

Thr Thr Ala Ser Ala Tyr Asn Gly Asn Asp Thr Gln Gly Trp Asn Asp
20 25 30

Lys Phe His Xaa Xaa Ser Val Lys Thr Gln Thr Xaa
35 40



(2) INFORMATION FOR SEQ ID NO: 243:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 51 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 243:

Met Ile Ser Asp Ala Gly Ala Gly Phe Gly Val Phe Leu Leu Val Pro
1 5 10 15

Arg Ala Gly His Cys Trp Gly Ala Gly Lys Pro Leu Pro Ser Cys Pro
20 25 30

Ser Val Ala Ser Ile Pro Ser Trp Val Leu Pro Ser Phe Leu Glu Arg
35 40 45

Gly Arg Xaa
50



(2) INFORMATION FOR SEQ ID NO: 244: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244:

55 

Mec Vai Gin Thr He Gin Asp Phe Leu Ser Lau Phe Ser Thr Pro- He 
1.5 10 15 

Phe Leu Leu Leu Leu- Men Phe Glu Thr Leu Ser Leu Ala Pro Ala Trp 
60 20 25 30 
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Leu Lys Pro Leu >r; Val Thr Ser His Ser >laa 
25 , 41 



(2) INFORMATION FOR SEQ ID NO: 245:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 61 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245:

Hec lie Leu Me, Pro Gly Leu * lly Thr 5er Arg Gin --org Ser val Pro 

i 5 i: 15 

Phe-Vai Pre Thr Leu Am Ale 3er Thr Pro Gly Ale Mec Thr Gly Pro 
■ 20 25 30 

Thr Ale Thr Leu Thr Ser Tys • 2- In Trp Thr Thr Ale Cys Arg val Ser 

35 . "4Z ^ 45 

Trp Ala Asn Gly Trp Tor Ser leu Arg Thr Pha Arg ICae 
50 • 55 50 



(2) INFORMATION FOR SEQ ID NO: 246:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 amino acids
(B) TYPE: amino acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246:

Mec Ser His His Ala 31r. Arg Pha Leu Leu lie Thr Ma" Leu Leu 

1 5. 13- 15 

Gin Giu Ale Lys Pro Val Ser Asn Ha Pro His leu Leu Giu Ser Trp 
20 25 30 

Tyr Phe Gly Xaa 
35 



(2) INFORMATION FOR SEQ ID NO: 247:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 amino acids
(B) TYPE: amino acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247:

Mec Asn Ser Leu Phe Irg Mec lie Leu Leu Pro val Ser Gin Asp Gin 
1 5 10 15 

Val Val Giu Gly Leu Gin Gly Sly Pha Ser C-lr. lie His Mec Arg He 
20 21 30 



Leu Arg Lys His Leu Xaa 
35 



(2) INFORMATION FOR SEQ ID NO: 248:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 211 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248:

Met Ser Arg Ser Xaa Asp Val Thr Asn Thr Thr Phe Leu Leu Met Ala
1 5 10 15

Ala Ser Ile Tyr Leu His Asp Gln Asn Pro Asp Ala Ala Leu Arg Ala
20 25 30

Leu His Gln Gly Asp Ser Leu Glu Cys Thr Ala Met Thr Val Gln Ile
35 40 45

Leu Leu Lys Leu Asp Arg Leu Asp Leu Ala Arg Lys Glu Leu Lys Arg
50 55 60

Met Gln Asp Leu Asp Glu Asp Ala Thr Leu Thr Gln Leu Ala Thr Ala
65 70 75 80

Trp Val Ser Leu Ala Thr Gly Gly Glu Lys Leu Gln Asp Ala Tyr Tyr
85 90 95

Ile Phe Gln Glu Met Ala Asp Lys Cys Ser Pro Thr Leu Leu Leu Leu
100 105 110

Asn Gly Gln Ala Ala Cys His Met Ala Gln Gly Arg Trp Glu Ala Ala
115 120 125

Glu Gly Leu Leu Gln Glu Ala Leu Asp Lys Asp Ser Gly Tyr Pro Glu
130 135 140

Thr Leu Val Asn Leu Ile Val Leu Ser Gln His Leu Gly Lys Pro Pro
145 150 155 160

Glu Val Thr Asn Arg Tyr Leu Ser Gln Leu Lys Asp Ala His Arg Ser
165 170 175

His Pro Phe Ile Lys Glu Tyr Gln Ala Lys Glu Asn Asp Phe Asp Arg
180 185 190

Leu Val Leu Gln Tyr Ala Pro Ser Ala Glu Ala Gly Pro Glu Leu Ser
195 200 205

Gly Pro Xaa 
210 



(2) INFORMATION FOR SEQ ID NO: 249:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 548 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249:

Mec Giu Asp 3er GIu Ala Leu Gly ?he GIu. Kis Mec Gly Leu Asp Pro 
15 10 15 

Arg Leu Leu Gin Ala Val Thr Asp Leu Gly Trp Ser Arg Pro Thr Leu 
20 25 30 

lis Gin GIu Lys Ala lie Pro Leu Ala Leu Giu Gly Lys Asp Leu Leu 
3 5 , 40 45 

Ala Arg Ala Arg Thr Gly Ser Gly Lys Thr Ala Ala Tyr Ala lie Pro 
50 " 55 60 . 

Mec Leu Gin Leu Leu Leu His Arg Lys Ala Thr Gly Pro Val Val Giu 
a5 70 75 30 

Gin Ala Val Arg Gly Leu Val Leu Val Pro Thr Lys Giu Leu Ala Arg 
85 90 95 

Gin" Ala Gin Ser Ken lie Gin Gin Leu Ala Thr Tyr Cys Ala Arg Asp 
I0Q f05 110 

Val Arg Val Ala Asn Val Ser Ala Ala Giu Asp Ser Val Ser Gin Arg - 
115 120 125 

Ala Val Leu Mec Giu Lys Pro Asp Val Val Val Gly Thr Pro Ser Arg 
130 135 140 

lie Leu Ser His Leu Gin Gin Asp Ser Leu Lys Leu Arg Asp Ser Leu 
145 150 155 160 

Giu Leu Leu -Val Val Asp Giu Ala Asp Leu Leu Phe Ser Phe Gly Phe 
- 1S5 ■ 170 175 

GIu Giu GIu Leu Lys Ser Leu Leu Cys His Leu Pro Arg lie Tyr Gin 
130. 135 190 

Ala Phe Leu Mec Ser Ala Thr Phe Asn Giu Asp Val Gin Ala Leu Lys 
195 • 200 205 

Giu Leu He Leu Kis Asn Pro Val Thr Leu Lys Leu Gin Giu Ser Gin 
210 • 2IS 220 

Leu Pro Gly Pro Asp Gin Leu Gin Gin Phe Gin Val Val Cys .Giu Thr 
225 230 235 240 

Giu Giu. Asp Lys. Phe Leu Leu Leu Tyr Ala Leu Leu Lys Leu Ser Leu 
245 250 255 

He Arg Gly Lys Ser Leu Leu Phe Val Asn Thr Leu GIu Arg Ser Tyr 
250 265 , 270 

Arg Leu Arg Leu Phe Leu Giu Gin Phe Ser He Pro Thr Cys Val Leu 

275 230 235" 
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Asn Gly Glu Leu Pro Leu Arg Ser Arg Cys Kis lie He Ser Gin ?he 
290 295 300 

Asn Gin Gly Phe Tyr Asp Cys 7a 1 He Ale Thr Asp Ala Glu Val Leu 
5 305' 310 315 320 

Gly Ala Pro val Lys Gly Lys Arg A.rg Gly Arg Gly Pro Lys Gly Asp 
325 330 335 

10 Lys Ala Ser Asp Pro Glu Ala Gly Val Ala Arg Gly lie Asp ?he His 
340 345 350 

His Val Ser Ala Val Leu Asn Phe Asp Leu Pro Pro Thr Pro Glu Ala 
355 360 365 

Tyr lie His Arg Ala Gly Arg Thr Ala Arg Ala .Asn Asn Pro Gly lie 
370 375 330 



15 



Val Leu Thr Phe Val Leu Pro Thr Glu Gin Phe His Leu Gly Lys lie 
20 335 / 390 395 " 400 

Glu Glu Leu Leu Ser Gly Glu Asn Arg Gly Pro lie Leu Leu Pro Tyr 
405 410 415 

25 Gin Phe Arg Met Glu Glu lie Giu Gly Phe Arg Tyr Arg Cys Arg Asp 
420 4-25 430 



30 



45 



A.ia Mec Arg Ser Val Thr Lys Gin Ala lie Arg Glu Ala Arg Leu Lys 
435 440 445 

Glu tie Lys Giu Giu Leu Leu His Ser Giu Lys Leu Lys Thr Tyr Phe. 
450 455 460 



Glu Asp Asn Pro Arc Asp "Leu Gin Leu Leu Arg His Asp Leu Pro Leu 

35 465 470 475 480 

His - Pro Ala Val Val Lys Prp. His Leu Gly His Val Pro As? Tyr Leu 

435 490 495 

40 Val Pro Pro Ala Leu Arg Gly Leu Val Arg Pro His Lys Lys Arg Lys 

500 505 510 



Lys Leu Ser Ser Ser Cys Arg Lys Ala Lys Arg Ala Lys Ser Gin Asn 
515 520 , 525 

Pro Leu Arg Ser Phe Lys His Lys Gly Lys Lys Phe Arg Pro Thr Ala 
530 535 540 



Lys Pro Ser Xaa 
50 545 



(2) INFORMATION FOR SEQ ID NO: 250: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 299 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250:



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Mec Thr Thr Vai Pro Pro Ser Pro Arg Pro Mec Ser Arg Pro Ser Giu 1 
1 5 10 - IS 

Arg Asn Mec Arg Arg Pro Arg Giy Pro Ser Pro Leu Pro Ala Ser Pro 
20 25 30 

Arg Asn Ser Thr Pro As? Giu Pro Asp Vai His ?he Ser Lys Lys Phe 
3S ■ - dQ 45. 

Leu Asn Vai Phe Met Ser Gly Arg Ser Arg Ser Ser Ser Ala Giu Ser 
50 55 SO 



Phe Gly Leu Phe Ser Cys lie .-lie Asn Gly Giu Giu Gin Giu Gin Thr 
15 55 70 75 - 30 



His Arg Ala lie Phe Arg Phe Vai Pro Arg His Giu Asp Giu Leu Giu 
35 -90 95 

Leu Giu Vai Asp Asp Pro Leu Leu Vai Giu Leu Gin Ala Giu Asp Tyr 
100 105 ' 110 

Trp Tyr Giu Ala Tyr Asn Mec Arg Thr Gly Ala Arg Gly Vai Phe Pro 
115 120 125 

Ala Tyr Tyr Ala lie Giu Vai Thr Lys Giu Pro Giu His Mec Ala Ala 
13 0 13 5 140 



Leu Ala- Lys Asn Ser Asp Trp Vai Asp Gin Phe Arg Vai Lys Phe Leu 
30 , 145 150 155 160 

Gly Ser Vai Gin Vai Pro Tyr His Lys Gly Asn Asp Vai Leu Cys Aia 
165 170 175 

35 Ala Siec Gin Lys lie Ala Thr Thr Arg Arg Leu Thr Vai His Phe Asn 
130 135 190 



Pro Pro Ser Ser Cys Vai Leu Giu lie Ser Vai Arg Gly Vai Lys lie- 
195' • 200 205 

Gly Vai Lys Aia Asp Asp Ser Gin' Giu Aia Lys Gly Asn Lys Cys Ser 
210 215 220 



His Phe Phe Gin Leu Lys Asn lie Ser Phe Cys Giy Tyr His Pro Lys 
45 225 230 235 240 

Asn Asn Lys Tyr Phe Giy Phe lie Thr Lys His Pro Aia Asp His Arg 

245 2S0 255 ' 

50 Phe Ala Cys His Vai Phe Vai Ser Giu Asp Ser Thr Lys Aia Leu Ala 
2S0 265 270 



Giu Ser Vai Gly Arg Ala Phe Gin Gin Phe Tyr Lys Gin Phe Vai Giu 
275 230 235 

Tyr Thr Cys Pro Thr Giu Asp -lie Tyr Leu Giu 
290 295 



60 
(2) INFORMATION FOR SEQ ID NO: 251:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251:

Leu Leu Tyr Leu Leu Lys Val Xaa Val Ile Phe Val Phe Ser Ser Ser
1 5 10 15

Lys Gly Val Thr Leu Val Ser Met Asn Leu Thr Ser Phe Phe Val Ser
20 25 30 

Ser Val Leu Ala Cys Phe Ser Xaa 
35 40



(2) INFORMATION FOR SEQ ID NO: 252:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 594 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252:

Met Pro Ala Ser Ser Leu Glu Ser Arg Ser Phe Leu Leu Ala Lys Lys
1 5 10 15

Ser Gly Glu Asn Val Ala Lys Phe Ile Ile Asn Ser Tyr Pro Lys Tyr
20 25 30

Phe Gln Lys Asp Ile Ala Glu Pro His Ile Pro Cys Leu Met Pro Glu
35 40 45

Tyr Phe Glu Pro Gln Ile Lys Asp Ile Ser Glu Ala Ala Leu Lys Glu
50 55 60

Arg Ile Glu Leu Arg Lys Val Lys Ala Ser Val Asp Met Phe Asp Gln
65 70 75 80

Leu Leu Gln Ala Gly Thr Thr Val Ser Leu Glu Thr Thr Asa Ser Leu
85 90 95

Leu Asp Xaa Leu Cys Tyr Tyr Gly Asp Gln Glu Pro Ser Thr Asp Tyr
100 105 110

His Phe Gln Gln Thr Gly Gln Ser Glu Ala Leu Glu Glu Glu Asn Asp
115 120 125

Glu Thr Ser Arg Arg Lys Ala Gly His Gln Phe Gly Val Thr Trp Arg
130 135 140

Ala Lys Asn Asn Ala Glu Arg Ile Phe Ser Leu Met Pro Glu Lys Asn
145 150 155 160

Glu His Ser Tyr Cys Thr Met Ile Arg Gly Met Val Lys His Arg Ala
165 170 175

Tyr Glu Gln Ala Leu Asn Leu Tyr Thr Glu Leu Leu Asn Asn Arg Leu
180 185 190

His Ala Asp Val Tyr Thr Phe Asn Ala Leu Ile Glu Ala Thr Val Cys
195 200 205

Ala Ile Asn Glu Lys Phe Glu Glu Lys Trp Ser Lys Ile Leu Glu Leu
210 215 220

Leu Arg His Met Val Ala Gln Lys Val Lys Pro Asn Leu Gln Thr Phe
225 230 235 240

Asn Thr Ile Leu Lys Cys Leu Arg Arg Phe His Val Phe Ala Arg Ser
245 250 255 

Pro Ala Leu Gln Val Leu Arg Glu Met Lys Ala Ile Gly Ile Glu Pro
260 265 270

Ser Leu Ala Thr Tyr His His Ile Ile Arg Leu Phe Asp Gln Pro Gly
275 280 285

Asp Pro Leu Lys Arg Ser Ser Phe Ile Ile Tyr Asp Ile Met Asn Glu
290 295 300

Leu Met Gly Lys Arg Phe Ser Pro Lys Asp Pro Asp Asp Asp Lys Phe
305 310 315 320

Phe Gln Ser Ala Met Ser Ile Cys Ser Ser Leu Arg Asp Leu Glu Leu
325 330 335

Ala Tyr Gln Val His Gly Leu Leu Lys Thr Gly Asp Asn Trp Lys Phe
340 345 350

Ile Gly Pro Asp Gln His Arg Asn Phe Tyr Tyr Ser Lys Phe Phe Asp
355 360 365

Leu Ile Cys Leu Met Glu Gln Ile Asp Val Thr Leu Lys Trp Tyr Glu
370 375 380

Asp Leu Ile Pro Ser Ala Tyr Phe Pro His Ser Gln Thr Met Ile His
385 390 395 400

Leu Leu Gln Ala Leu Asp Val Ala Asn Arg Leu Glu Val Ile Pro Lys
405 410 415

Ile Trp Lys Asp Ser Lys Glu Tyr Gly His Thr Phe Arg Ser Asp Leu
420 425 430

Arg Glu Glu Ile Leu Met Leu Met Ala Arg Asp Lys His Pro Pro Glu
435 440 445

Leu Gln Val Ala Phe Ala Asp Cys Ala Ala Asp Ile Lys Ser Ma Tyr
450 455 460 

Glu Ser Gln Pro Ile Arg Gln Thr Ala Gln Asp Trp Pro Ala Thr Ser
465 470 475 480

Leu Asn Cys Ile Ala Ile Leu Phe Leu Arg Ala Gly Arg Thr Gln Glu
485 490 495



